Proteases

ABSTRACT

The invention provides human proteases (PRTS) and polynucleotides which identify and encode PRTS. The invention also provides expression vectors, host cells, antibodies, agonists, and antagonists. The invention also provides methods for diagnosing, treating, or preventing disorders associated with aberrant expression of PRTS.

TECHNICAL FIELD

[0001] This invention relates to nucleic acid and amino acid sequences of proteases and to the use of these sequences in the diagnosis, treatment, and prevention of gastrointestinal, cardiovascular, autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, and reproductive disorders, and in the assessment of the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of proteases.

BACKGROUND OF THE INVENTION

[0002] Proteases cleave proteins and peptides at the peptide bond that forms the backbone of the protein or peptide chain. Proteolysis is one of the most important and frequent enzymatic reactions that occur both within and outside of cells. Proteolysis is responsible for the activation and maturation of nascent polypeptides, the degradation of misfolded and damaged proteins, and the controlled turnover of peptides within the cell. Proteases participate in digestion, endocrine function, and tissue remodeling during embryonic development, wound healing, and normal growth. Proteases can play a role in regulatory processes by affecting the half-life of regulatory proteins. Proteases are involved in the etiology or progression of disease states such as inflammation, angiogenesis, tumor dispersion and metastasis, cardiovascular disease, neurological disease, and bacterial, parasitic, and viral infections.

[0003] Proteases can be categorized on the basis of where they cleave their substrates. Exopeptidases, which include aminopeptidases, dipeptidyl peptidases, tripeptidases, carboxypeptidases, peptidyl-di-peptidases, dipeptidases, and omega peptidases, cleave residues at the termini of their substrates. Endopeptidases, including serine proteases, cysteine proteases, and metalloproteases, cleave at residues within the peptide. Four principal categories of mammalian proteases have been identified based on active site structure, mechanism of action, and overall three-dimensional structure. (See Beynon, R. J. and J. S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University Press, New York N.Y., pp. 1-5.)

[0004] Serine Proteases

[0005] The serine proteases (SPs) are a large, widespread family of proteolytic enzymes that include the digestive enzymes trypsin and chymotrypsin, components of the complement and blood-clotting cascades, and enzymes that control the degradation and turnover of macromolecules within the cell and in the extracellular matrix. Most of the more than 20 subfamilies can be grouped into six clans, each with a common ancestor. These six clans are hypothesized to have descended from at least four evolutionarily distinct ancestors. SPs are named for the presence of a serine residue found in the active catalytic site of most families. The active site is defined by the catalytic triad, a set of conserved aspartate, histidine, and serine residues critical for catalysis. These residues form a charge relay network that facilitates substrate binding. Other residues outside the active site form an oxyanion hole that stabilizes the tetrahedral transition intermediate formed during catalysis. SPs have a wide range of substrates and can be subdivided into subfamilies on the basis of their substrate specificity. The main subfamilies are named for the residue(s) after which they cleave: trypases (after arginine or lysine), aspases (after aspartate), chymases (after phenylalanine or leucine), metases (after methionine), and serases (after serine) (Rawlings, N. D. and A. J. Barrett (1994) Methods Enzymol. 244:19-61).
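The subfamily naming rule above can be illustrated with a minimal Python sketch. It assumes that cleavage depends only on the single residue preceding the scissile bond, which is a simplification of real protease specificity. The dictionary and function names are hypothetical and chosen for illustration only.

```python
# Residue sets follow the subfamily names given in the text above.
# CLEAVE_AFTER, cleavage_sites, and fragments are hypothetical names.
CLEAVE_AFTER = {
    "trypase": set("RK"),  # after arginine or lysine
    "aspase": set("D"),    # after aspartate
    "chymase": set("FL"),  # after phenylalanine or leucine
    "metase": set("M"),    # after methionine
    "serase": set("S"),    # after serine
}


def cleavage_sites(sequence, subfamily):
    """Return 1-based positions of residues after which the bond is cut.

    The C-terminal residue is skipped because no peptide bond follows it.
    """
    residues = CLEAVE_AFTER[subfamily]
    seq = sequence.upper()
    return [i + 1 for i, aa in enumerate(seq[:-1]) if aa in residues]


def fragments(sequence, subfamily):
    """Split the sequence at every predicted cleavage site."""
    cuts = [0] + cleavage_sites(sequence, subfamily) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(cuts, cuts[1:])]
```

For instance, calling fragments("AKGRSDFM", "trypase") splits the peptide after the lysine at position 2 and the arginine at position 4. A fuller model would also consider the residues flanking the cleavage site.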

[0006] Most mammalian serine proteases are synthesized as zymogens, inactive precursors that are activated by proteolysis. For example, trypsinogen is converted to its active form, trypsin, by enteropeptidase. Enteropeptidase is an intestinal protease that removes an N-terminal fragment from trypsinogen. The remaining active fragment is trypsin, which in turn activates the precursors of the other pancreatic enzymes. Likewise, proteolysis of prothrombin, the precursor of thrombin, generates three separate polypeptide fragments. The N-terminal fragment is released while the other two fragments, which comprise active thrombin, remain associated through disulfide bonds.

[0007] The two largest SP subfamilies are the chymotrypsin (S1) and subtilisin (S8) families. Some members of the chymotrypsin family contain two structural domains unique to this family. Kringle domains are triple-looped, disulfide cross-linked domains found in varying copy number. Kringles are thought to play a role in binding mediators such as membranes, other proteins or phospholipids, and in the regulation of proteolytic activity (PROSITE PDOC00020). Apple domains are 90 amino-acid repeated domains, each containing six conserved cysteines. Three disulfide bonds link the first and sixth, second and fifth, and third and fourth cysteines (PROSITE PDOC00376). Apple domains are involved in protein-protein interactions. S1 family members include trypsin, chymotrypsin, coagulation factors IX-XII, complement factors B, C, and D, granzymes, kallikrein, and tissue- and urokinase-plasminogen activators. The subtilisin family has members found in the eubacteria, archaebacteria, eukaryotes, and viruses. Subtilisins include the proprotein-processing endopeptidases kexin and furin and the pituitary prohormone convertases PC1, PC2, PC3, PC6, and PACE4 (Rawlings and Barrett, supra).

[0008] SPs have functions in many normal processes and some have been implicated in the etiology or treatment of disease. Enterokinase, the initiator of intestinal digestion, is found in the intestinal brush border, where it cleaves the acidic propeptide from trypsinogen to yield active trypsin (Kitamoto, Y. et al. (1994) Proc. Natl. Acad. Sci. USA 91:7588-7592). Prolylcarboxypeptidase, a lysosomal serine peptidase that cleaves peptides such as angiotensin II and III and [des-Arg9] bradykinin, shares sequence homology with members of both the serine carboxypeptidase and prolylendopeptidase families (Tan, F. et al. (1993) J. Biol. Chem. 268:16631-16638). The protease neuropsin may influence synapse formation and neuronal connectivity in the hippocampus in response to neural signaling (Chen, Z.-L. et al. (1995) J. Neurosci. 15:5088-5097). Tissue plasminogen activator is useful for acute management of stroke (Zivin, J. A. (1999) Neurology 53:14-19) and myocardial infarction (Ross, A. M. (1999) Clin. Cardiol. 22:165-171). Some receptors (PAR, for proteinase-activated receptor), highly expressed throughout the digestive tract, are activated by proteolytic cleavage of an extracellular domain. The major agonists for PARs, thrombin, trypsin, and mast cell tryptase, are released in allergy and inflammatory conditions. Control of PAR activation by proteases has been suggested as a promising therapeutic target (Vergnolle, N. (2000) Aliment. Pharmacol. Ther. 14:257-266; Rice, K. D. et al. (1998) Curr. Pharm. Des. 4:381-396). Prostate-specific antigen (PSA) is a kallikrein-like serine protease synthesized and secreted exclusively by epithelial cells in the prostate gland. Serum PSA is elevated in prostate cancer and is the most sensitive physiological marker for monitoring cancer progression and response to therapy. PSA can also identify the prostate as the origin of a metastatic tumor (Brawer, M. K. and P. H. Lange (1989) Urology 33:11-16).

[0009] The signal peptidase is a specialized class of SP found in all prokaryotic and eukaryotic cell types that serves in the processing of signal peptides from certain proteins. Signal peptides are amino-terminal domains of a protein which direct the protein from its ribosomal assembly site to a particular cellular or extracellular location. Once the protein has been exported, removal of the signal sequence by a signal peptidase and posttranslational processing, e.g., glycosylation or phosphorylation, activate the protein. Signal peptidases exist as multi-subunit complexes in both yeast and mammals. The canine signal peptidase complex is composed of five subunits, all associated with the microsomal membrane and containing hydrophobic regions that span the membrane one or more times (Shelness, G. S. and G. Blobel (1990) J. Biol. Chem. 265:9512-9519). Some of these subunits serve to fix the complex in its proper position on the membrane while others contain the actual catalytic activity.

[0010] Another family of proteases which have a serine in their active site are dependent on the hydrolysis of ATP for their activity. These proteases contain proteolytic core domains and regulatory ATPase domains which can be identified by the presence of the P-loop, an ATP/GTP-binding motif (PROSITE PDOC00803). Members of this family include the eukaryotic mitochondrial matrix proteases, Clp protease and the proteasome. Clp protease was originally found in plant chloroplasts but is believed to be widespread in both prokaryotic and eukaryotic cells. The gene for early-onset torsion dystonia encodes a protein related to Clp protease (Ozelius, L. J. et al. (1998) Adv. Neurol. 78:93-105).

[0011] The proteasome is an intracellular protease complex found in some bacteria and in all eukaryotic cells, and plays an important role in cellular physiology. Proteasomes are associated with the ubiquitin conjugation system (UCS), a major pathway for the degradation of cellular proteins of all types, including proteins that function to activate or repress cellular processes such as transcription and cell cycle progression (Ciechanover, A. (1994) Cell 79:13-21). In the UCS pathway, proteins targeted for degradation are conjugated to ubiquitin, a small heat stable protein. The ubiquitinated protein is then recognized and degraded by the proteasome. The resultant ubiquitin-peptide complex is hydrolyzed by a ubiquitin carboxyl terminal hydrolase, and free ubiquitin is released for reutilization by the UCS. Ubiquitin-proteasome systems are implicated in the degradation of mitotic cyclic kinases, oncoproteins, tumor suppressor genes (p53), cell surface receptors associated with signal transduction, transcriptional regulators, and mutated or damaged proteins (Ciechanover, supra). This pathway has been implicated in a number of diseases, including cystic fibrosis, Angelman's syndrome, and Liddle syndrome (reviewed in Schwartz, A. L. and A. Ciechanover (1999) Annu. Rev. Med. 50:57-74). A murine proto-oncogene, Unp, encodes a nuclear ubiquitin protease whose overexpression leads to oncogenic transformation of NIH3T3 cells. The human homologue of this gene is consistently elevated in small cell tumors and adenocarcinomas of the lung (Gray, D. A. (1995) Oncogene 10:2179-2183). Ubiquitin carboxyl terminal hydrolase is involved in the differentiation of a lymphoblastic leukemia cell line to a non-dividing mature state (Maki, A. et al. (1996) Differentiation 60:59-66). In neurons, ubiquitin carboxyl terminal hydrolase (PGP 9.5) expression is strong in the abnormal structures that occur in human neurodegenerative diseases (Lowe, J. et al. (1990) J. Pathol. 161:153-160). The proteasome is a large (~2000 kDa) multisubunit complex composed of a central catalytic core containing a variety of proteases arranged in four seven-membered rings with the active sites facing inwards into the central cavity, and terminal ATPase subunits covering the outer port of the cavity and regulating substrate entry (for review, see Schmidt, M. et al. (1999) Curr. Opin. Chem. Biol. 3:584-591).

[0012] Cysteine Proteases

[0013] Cysteine proteases (CPs) are involved in diverse cellular processes ranging from the processing of precursor proteins to intracellular degradation. Nearly half of the CPs known are present only in viruses. CPs have a cysteine as the major catalytic residue at the active site where catalysis proceeds via a thioester intermediate and is facilitated by nearby histidine and asparagine residues. A glutamine residue is also important, as it helps to form an oxyanion hole. Two important CP families include the papain-like enzymes (C1) and the calpains (C2). Papain-like family members are generally lysosomal or secreted and therefore are synthesized with signal peptides as well as propeptides. Most members bear a conserved motif in the propeptide that may have structural significance (Karrer, K. M. et al. (1993) Proc. Natl. Acad. Sci. USA 90:3063-3067). Three-dimensional structures of papain family members show a bilobed molecule with the catalytic site located between the two lobes. Papains include cathepsins B, C, H, L, and S, certain plant allergens and dipeptidyl peptidase (for a review, see Rawlings, N. D. and A. J. Barrett (1994) Methods Enzymol. 244:461-486).

[0014] Some CPs are expressed ubiquitously, while others are produced only by cells of the immune system. Of particular note, CPs are produced by monocytes, macrophages and other cells which migrate to sites of inflammation and secrete molecules involved in tissue repair. Overabundance of these repair molecules plays a role in certain disorders. In autoimmune diseases such as rheumatoid arthritis, secretion of the cysteine peptidase cathepsin C degrades collagen, laminin, elastin and other structural proteins found in the extracellular matrix of bones. Bone weakened by such degradation is also more susceptible to tumor invasion and metastasis. Cathepsin L expression may also contribute to the influx of mononuclear cells which exacerbates the destruction of the rheumatoid synovium (Keyszer, G. M. (1995) Arthritis Rheum. 38:976-984).

[0015] Calpains are calcium-dependent cytosolic endopeptidases which contain both an N-terminal catalytic domain and a C-terminal calcium-binding domain. Calpain is expressed as a proenzyme heterodimer consisting of a catalytic subunit unique to each isoform and a regulatory subunit common to different isoforms. Each subunit bears a calcium-binding EF-hand domain. The regulatory subunit also contains a hydrophobic glycine-rich domain that allows the enzyme to associate with cell membranes. Calpains are activated by increased intracellular calcium concentration, which induces a change in conformation and limited autolysis. The resultant active molecule requires a lower calcium concentration for its activity (Chan, S. L. and M. P. Mattson (1999) J. Neurosci. Res. 58:167-190). Calpain expression is predominantly neuronal, although it is present in other tissues. Several chronic neurodegenerative disorders, including ALS, Parkinson's disease and Alzheimer's disease, are associated with increased calpain expression (Chan and Mattson, supra). Calpain-mediated breakdown of the cytoskeleton has been proposed to contribute to brain damage resulting from head injury (McCracken, E. et al. (1999) J. Neurotrauma 16:749-761). Calpain-3 is predominantly expressed in skeletal muscle, and is responsible for limb-girdle muscular dystrophy type 2A (Minami, N. et al. (1999) J. Neurol. Sci. 171:31-37).

[0016] Another family of thiol proteases is the caspases, which are involved in the initiation and execution phases of apoptosis. A pro-apoptotic signal can activate initiator caspases that trigger a proteolytic caspase cascade, leading to the hydrolysis of target proteins and the classic apoptotic death of the cell. Two active site residues, a cysteine and a histidine, have been implicated in the catalytic mechanism. Caspases are among the most specific endopeptidases, cleaving after aspartate residues. Caspases are synthesized as inactive zymogens consisting of one large (p20) and one small (p10) subunit separated by a small spacer region, and a variable N-terminal prodomain. This prodomain interacts with cofactors that can positively or negatively affect apoptosis. An activating signal causes autoproteolytic cleavage of a specific aspartate residue (D297 in the caspase-1 numbering convention) and removal of the spacer and prodomain, leaving a p10/p20 heterodimer. Two of these heterodimers interact via their small subunits to form the catalytically active tetramer. The long prodomains of some caspase family members have been shown to promote dimerization and auto-processing of procaspases. Some caspases contain a “death effector domain” in their prodomain by which they can be recruited into self-activating complexes with other caspases and FADD protein associated death receptors or the TNF receptor complex. In addition, two dimers from different caspase family members can associate, changing the substrate specificity of the resultant tetramer. Endogenous caspase inhibitors (inhibitor of apoptosis proteins, or IAPs) also exist. All these interactions have clear effects on the control of apoptosis (reviewed in Chan and Mattson, supra; Salveson, G. S. and V. M. Dixit (1999) Proc. Natl. Acad. Sci. USA 96:10964-10967).

[0017] Caspases have been implicated in a number of diseases. Mice lacking some caspases have severe nervous system defects due to failed apoptosis in the neuroepithelium and suffer early lethality. Others show severe defects in the inflammatory response, as caspases are responsible for processing IL-1b and possibly other inflammatory cytokines (Chan and Mattson, supra). Cowpox virus and baculoviruses target caspases to avoid the death of their host cell and promote successful infection. In addition, increases in inappropriate apoptosis have been reported in AIDS, neurodegenerative diseases and ischemic injury, while a decrease in cell death is associated with cancer (Salveson and Dixit, supra; Thompson, C. B. (1995) Science 267:1456-1462).

[0018] Aspartyl Proteases

[0019] Aspartyl proteases (APs) include the lysosomal proteases cathepsins D and E, as well as chymosin, renin, and the gastric pepsins. Most retroviruses encode an AP, usually as part of the pol polyprotein. APs, also called acid proteases, are monomeric enzymes consisting of two domains, each domain containing one half of the active site with its own catalytic aspartic acid residue. APs are most active in the range of pH 2-3, at which one of the aspartate residues is ionized and the other neutral. The pepsin family of APs contains many secreted enzymes, and all are likely to be synthesized with signal peptides and propeptides. Most family members have three disulfide loops, the first ~5 residue loop following the first aspartate, the second 5-6 residue loop preceding the second aspartate, and the third and largest loop occurring toward the C terminus. Retropepsins, on the other hand, are analogous to a single domain of pepsin, and become active as homodimers with each retropepsin monomer contributing one half of the active site. Retropepsins are required for processing the viral polyproteins.

[0020] APs have roles in various tissues, and some have been associated with disease. Renin mediates the first step in processing the hormone angiotensin, which is responsible for regulating electrolyte balance and blood pressure (reviewed in Crews, D. E. and S. R. Williams (1999) Hum. Biol. 71:475-503). Abnormal regulation and expression of cathepsins are evident in various inflammatory disease states. Expression of cathepsin D is elevated in synovial tissues from patients with rheumatoid arthritis and osteoarthritis. The increased expression and differential regulation of the cathepsins are linked to the metastatic potential of a variety of cancers (Chambers, A. F. et al. (1993) Crit. Rev. Oncol. 4:95-114).

[0021] Metalloproteases

[0022] Metalloproteases require a metal ion for activity, usually manganese or zinc. Examples of manganese metalloenzymes include aminopeptidase P and human proline dipeptidase (PEPD). Aminopeptidase P can degrade bradykinin, a nonapeptide activated in a variety of inflammatory responses. Aminopeptidase P has been implicated in coronary ischemia/reperfusion injury. Administration of aminopeptidase P inhibitors has been shown to have a cardioprotective effect in rats (Ersahin, C. et al. (1999) J. Cardiovasc. Pharmacol. 34:604-611).

[0023] Most zinc-dependent metalloproteases share a common sequence in the zinc-binding domain. The active site is made up of two histidines which act as zinc ligands and a catalytic glutamic acid C-terminal to the first histidine. Proteins containing this signature sequence are known as the metzincins and include aminopeptidase N, angiotensin-converting enzyme, neurolysin, the matrix metalloproteases and the adamalysins (ADAMS). An alternate sequence is found in the zinc carboxypeptidases, in which all three conserved residues (two histidines and a glutamic acid) are involved in zinc binding.

[0024] A number of the neutral metalloendopeptidases, including angiotensin converting enzyme and the aminopeptidases, are involved in the metabolism of peptide hormones. High aminopeptidase B activity, for example, is found in the adrenal glands and neurohypophyses of hypertensive rats (Prieto, I. et al. (1998) Horm. Metab. Res. 30:246-248). Oligopeptidase M/neurolysin can hydrolyze bradykinin as well as neurotensin (Serizawa, A. et al. (1995) J. Biol. Chem. 270:2092-2098). Neurotensin is a vasoactive peptide that can act as a neurotransmitter in the brain, where it has been implicated in limiting food intake (Tritos, N. A. et al. (1999) Neuropeptides 33:339-349).

[0025] The matrix metalloproteases (MMPs) are a family of at least 23 enzymes that can degrade components of the extracellular matrix (ECM). They are Zn²⁺ endopeptidases with an N-terminal catalytic domain. Nearly all members of the family have a hinge peptide and C-terminal domain which can bind to substrate molecules in the ECM or to inhibitors produced by the tissue (TIMPs, for tissue inhibitor of metalloprotease; Campbell, I. L. et al. (1999) Trends Neurosci. 22:285). The presence of fibronectin-like repeats, transmembrane domains, or C-terminal hemopexinase-like domains can be used to separate MMPs into collagenase, gelatinase, stromelysin and membrane-type MMP subfamilies. In the inactive form, the Zn²⁺ ion in the active site interacts with a cysteine in the pro-sequence. Activating factors disrupt the Zn²⁺-cysteine interaction, or “cysteine switch,” exposing the active site. This partially activates the enzyme, which then cleaves off its propeptide and becomes fully active. MMPs are often activated by the serine proteases plasmin and furin. MMPs are often regulated by stoichiometric, noncovalent interactions with inhibitors; the balance of protease to inhibitor, then, is very important in tissue homeostasis (reviewed in Yong, V. W. et al. (1998) Trends Neurosci. 21:75).

[0026] MMPs are implicated in a number of diseases including osteoarthritis (Mitchell, P. et al. (1996) J. Clin. Invest. 97:761), atherosclerotic plaque rupture (Sukhova, G. K. et al. (1999) Circulation 99:2503), aortic aneurysm (Schneiderman, J. et al. (1998) Am. J. Path. 152:703), non-healing wounds (Saarialho-Kere, U. K. et al. (1994) J. Clin. Invest. 94:79), bone resorption (Blavier, L. and J. M. Delaisse (1995) J. Cell Sci. 108:3649), age-related macular degeneration (Steen, B. et al. (1998) Invest. Ophthalmol. Vis. Sci. 39:2194), emphysema (Finlay, G. A. et al. (1997) Thorax 52:502), myocardial infarction (Rohde, L. E. et al. (1999) Circulation 99:3063) and dilated cardiomyopathy (Thomas, C. V. et al. (1998) Circulation 97:1708). MMP inhibitors prevent metastasis of mammary carcinoma and experimental tumors in rat, and Lewis lung carcinoma, hemangioma, and human ovarian carcinoma xenografts in mice (Eccles, S. A. et al. (1996) Cancer Res. 56:2815; Anderson et al. (1996) Cancer Res. 56:715-718; Volpert, O. V. et al. (1996) J. Clin. Invest. 98:671; Taraboletti, G. et al. (1995) J. NCI 87:293; Davies, B. et al. (1993) Cancer Res. 53:2087). MMPs may be active in Alzheimer's disease. A number of MMPs are implicated in multiple sclerosis, and administration of MMP inhibitors can relieve some of its symptoms (reviewed in Yong, supra).

[0027] Another family of metalloproteases is the ADAMs, for A Disintegrin and Metalloprotease Domain, which they share with their close relatives the adamalysins, snake venom metalloproteases (SVMPs). ADAMs combine features of both cell surface adhesion molecules and proteases, containing a prodomain, a protease domain, a disintegrin domain, a cysteine rich domain, an epidermal growth factor repeat, a transmembrane domain, and a cytoplasmic tail. The first three domains listed above are also found in the SVMPs. The ADAMs possess four potential functions: proteolysis, adhesion, signaling and fusion. The ADAMs share the metzincin zinc binding sequence and are inhibited by some MMP antagonists such as TIMP-1.

[0028] ADAMs are implicated in such processes as sperm-egg binding and fusion, myoblast fusion, and protein-ectodomain processing or shedding of cytokines, cytokine receptors, adhesion proteins and other extracellular protein domains (Schlöndorff, J. and C. P. Blobel (1999) J. Cell. Sci. 112:3603-3617). The Kuzbanian protein cleaves a substrate in the NOTCH pathway (possibly NOTCH itself), activating the program for lateral inhibition in Drosophila neural development. Two ADAMs, TACE (ADAM 17) and ADAM 10, are proposed to have analogous roles in the processing of amyloid precursor protein in the brain (Schlöndorff and Blobel, supra). TACE has also been identified as the TNF activating enzyme (Black, R. A. et al. (1997) Nature 385:729). TNF is a pleiotropic cytokine that is important in mobilizing host defenses in response to infection or trauma, but can cause severe damage in excess and is often overproduced in autoimmune disease. TACE cleaves membrane-bound pro-TNF to release a soluble form. Other ADAMs may be involved in a similar type of processing of other membrane-bound molecules.

[0029] The ADAMTS sub-family has all of the features of ADAM family metalloproteases and contains an additional thrombospondin domain (TS). The prototypic ADAMTS was identified in mouse, found to be expressed in heart and kidney and upregulated by proinflammatory stimuli (Kuno, K. et al. (1997) J. Biol. Chem. 272:556-562). To date eleven members are recognized by the Human Genome Organization (HUGO; http://www.gene.ucl.ac.uk/users/hester/adamts.html#Approved). Members of this family have the ability to degrade aggrecan, a high molecular weight proteoglycan which provides cartilage with important mechanical properties including compressibility, and which is lost during the development of arthritis. Enzymes which degrade aggrecan are thus considered attractive targets to prevent and slow the degradation of articular cartilage (See, e.g., Tortorella, M. D. (1999) Science 284:1664; Abbaszade, I. (1999) J. Biol. Chem. 274:23443). Other members are reported to have antiangiogenic potential (Kuno et al., supra) and/or procollagen processing (Colige, A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2374).

[0030] The discovery of new proteases, and the polynucleotides encoding them, satisfies a need in the art by providing new compositions which are useful in the diagnosis, prevention, and treatment of gastrointestinal, cardiovascular, autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, and reproductive disorders, and in the assessment of the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of proteases.

SUMMARY OF THE INVENTION

[0031] The invention features purified polypeptides, proteases, referred to collectively as “PRTS” and individually as “PRTS-1,” “PRTS-2,” “PRTS-3,” “PRTS-4,” “PRTS-5,” “PRTS-6,” “PRTS-7,” “PRTS-8,” “PRTS-9,” “PRTS-10,” “PRTS-11,” “PRTS-12,” “PRTS-13,” “PRTS-14,” “PRTS-15,” “PRTS-16,” and “PRTS-17.” In one aspect, the invention provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17. In one alternative, the invention provides an isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 1-17.
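As a rough illustration of the "at least 90% identical" criterion, the following Python sketch computes percent identity between two sequences that have already been aligned. This passage does not specify the alignment method or how gaps are counted, so the sketch assumes identity is measured over ungapped alignment columns only. The function names and the gap convention are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Assumption: both inputs are rows of a pairwise alignment of equal
    length; columns containing a gap are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give different values for the same pair, so the choice of denominator matters near the threshold.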

[0032] The invention further provides an isolated polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17. In one alternative, the polynucleotide encodes a polypeptide selected from the group consisting of SEQ ID NO: 1-17. In another alternative, the polynucleotide is selected from the group consisting of SEQ ID NO: 18-34.

[0033] Additionally, the invention provides a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17. In one alternative, the invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the invention provides a transgenic organism comprising the recombinant polynucleotide.

[0034] The invention also provides a method for producing a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17. The method comprises a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

[0035] Additionally, the invention provides an isolated antibody which specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17.

[0036] The invention further provides an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). In one alternative, the polynucleotide comprises at least 60 contiguous nucleotides.

[0037] Additionally, the invention provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and optionally, if present, the amount thereof. In one alternative, the probe comprises at least 60 contiguous nucleotides.

[0038] The invention further provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.

[0039] The invention further provides a composition comprising aneffective amount of a polypeptide selected from the group consisting ofa) a polypeptide comprising an amino acid sequence selected from thegroup consisting of SEQ ID NO: 1-17, b) a polypeptide comprising anaturally occurring amino acid sequence at least 90% identical to anamino acid sequence selected from the group consisting of SEQ ID NO:1-17, c) a biologically active fragment of a polypeptide having an aminoacid sequence selected from the group consisting of SEQ ID NO: 1-17, andd) an immunogenic fragment of a polypeptide having an amino acidsequence selected from the group consisting of SEQ ID NO: 1-17, and apharmaceutically acceptable excipient. In one embodiment, thecomposition comprises an amino acid sequence selected from the groupconsisting of SEQ ID NO: 1-17. The invention additionally provides amethod of treating a disease or condition associated with decreasedexpression of functional PRTS, comprising administering to a patient inneed of such treatment the composition.

[0040] The invention also provides a method for screening a compound foreffectiveness as an agonist of a polypeptide selected from the groupconsisting of a) a polypeptide comprising an amino acid sequenceselected from the group consisting of SEQ ID NO: 1-17, b) a polypeptidecomprising a naturally occurring amino acid sequence at least 90%identical to an amino acid sequence selected from the group consistingof SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptidehaving an amino acid sequence selected from the group consisting of SEQID NO: 1-17, and d) an immunogenic fragment of a polypeptide having anamino acid sequence selected from the group consisting of SEQ ID NO:1-17. The method comprises a) exposing a sample comprising thepolypeptide to a compound, and b) detecting agonist activity in thesample. In one alternative, the invention provides a compositioncomprising an agonist compound identified by the method and apharmaceutically acceptable excipient. In another alternative, theinvention provides a method of treating a disease or conditionassociated with decreased expression of functional PRTS, comprisingadministering to a patient in need of such treatment the composition.

[0041] Additionally, the invention provides a method for screening a compound for effectiveness as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the invention provides a composition comprising an antagonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with overexpression of functional PRTS, comprising administering to a patient in need of such treatment the composition.

[0042] The invention further provides a method of screening for a compound that specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17. The method comprises a) combining the polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide to the test compound, thereby identifying a compound that specifically binds to the polypeptide.

[0043] The invention further provides a method of screening for acompound that modulates the activity of a polypeptide selected from thegroup consisting of a) a polypeptide comprising an amino acid sequenceselected from the group consisting of SEQ ID NO: 1-17, b) a polypeptidecomprising a naturally occurring amino acid sequence at least 90%identical to an amino acid sequence selected from the group consistingof SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptidehaving an amino acid sequence selected from the group consisting of SEQID NO: 1-17, and d) an immunogenic fragment of a polypeptide having anamino acid sequence selected from the group consisting of SEQ ID NO:1-17. The method comprises a) combining the polypeptide with at leastone test compound under conditions permissive for the activity of thepolypeptide, b) assessing the activity of the polypeptide in thepresence of the test compound, and c) comparing the activity of thepolypeptide in the presence of the test compound with the activity ofthe polypeptide in the absence of the test compound, wherein a change inthe activity of the polypeptide in the presence of the test compound isindicative of a compound that modulates the activity of the polypeptide.

[0044] The invention further provides a method for screening a compoundfor effectiveness in altering expression of a target polynucleotide,wherein said target polynucleotide comprises a polynucleotide sequenceselected from the group consisting of SEQ ID NO: 18-34, the methodcomprising a) exposing a sample comprising the target polynucleotide toa compound, and b) detecting altered expression of the targetpolynucleotide.

[0045] The invention further provides a method for assessing toxicity of a test compound, said method comprising a) treating a biological sample containing nucleic acids with the test compound; b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, iii) a polynucleotide complementary to the polynucleotide of i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Alternatively, the target polynucleotide comprises a fragment of a polynucleotide sequence selected from the group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES

[0046] Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide sequences of the present invention.

[0047] Table 2 shows the GenBank identification number and annotation of the nearest GenBank homolog for polypeptides of the invention. The probability score for the match between each polypeptide and its GenBank homolog is also shown.

[0048] Table 3 shows structural features of polypeptide sequences of the invention, including predicted motifs and domains, along with the methods, algorithms, and searchable databases used for analysis of the polypeptides.

[0049] Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble polynucleotide sequences of the invention, along with selected fragments of the polynucleotide sequences.

[0050] Table 5 shows the representative cDNA library for polynucleotides of the invention.

[0051] Table 6 provides an appendix which describes the tissues and vectors used for construction of the cDNA libraries shown in Table 5.

[0052] Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and polypeptides of the invention, along with applicable descriptions, references, and threshold parameters.

DESCRIPTION OF THE INVENTION

[0053] Before the present proteins, nucleotide sequences, and methodsare described, it is understood that this invention is not limited tothe particular machines, materials and methods described, as these mayvary. It is also to be understood that the terminology used herein isfor the purpose of describing particular embodiments only, and is notintended to limit the scope of the present invention which will belimited only by the appended claims.

[0054] It must be noted that as used herein and in the appended claims,the singular forms “a,” “an,” and “the” include plural reference unlessthe context clearly dictates otherwise. Thus, for example, a referenceto “a host cell” includes a plurality of such host cells, and areference to “an antibody” is a reference to one or more antibodies andequivalents thereof known to those skilled in the art, and so forth.

[0055] Unless defined otherwise, all technical and scientific terms usedherein have the same meanings as commonly understood by one of ordinaryskill in the art to which this invention belongs. Although any machines,materials, and methods similar or equivalent to those described hereincan be used to practice or test the present invention, the preferredmachines, materials and methods are now described. All publicationsmentioned herein are cited for the purpose of describing and disclosingthe cell lines, protocols, reagents and vectors which are reported inthe publications and which might be used in connection with theinvention. Nothing herein is to be construed as an admission that theinvention is not entitled to antedate such disclosure by virtue of priorinvention.

Definitions

[0056] “PRTS” refers to the amino acid sequences of substantially purified PRTS obtained from any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

[0057] The term “agonist” refers to a molecule which intensifies ormimics the biological activity of PRTS. Agonists may include proteins,nucleic acids, carbohydrates, small molecules, or any other compound orcomposition which modulates the activity of PRTS either by directlyinteracting with PRTS or by acting on components of the biologicalpathway in which PRTS participates.

[0058] An “allelic variant” is an alternative form of the gene encodingPRTS. Allelic variants may result from at least one mutation in thenucleic acid sequence and may result in altered mRNAs or in polypeptideswhose structure or function may or may not be altered. A gene may havenone, one, or many allelic variants of its naturally occurring form.Common mutational changes which give rise to allelic variants aregenerally ascribed to natural deletions, additions, or substitutions ofnucleotides. Each of these types of changes may occur alone, or incombination with the others, one or more times in a given sequence.

[0059] “Altered” nucleic acid sequences encoding PRTS include those sequences with deletions, insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as PRTS or a polypeptide with at least one functional characteristic of PRTS. Included within this definition are polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding PRTS, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding PRTS. The encoded protein may also be “altered,” and may contain deletions, insertions, or substitutions of amino acid residues which produce a silent change and result in a functionally equivalent PRTS. Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological or immunological activity of PRTS is retained. For example, negatively charged amino acids may include aspartic acid and glutamic acid, and positively charged amino acids may include lysine and arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values may include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and alanine; and phenylalanine and tyrosine.

[0060] The terms “amino acid” and “amino acid sequence” refer to anoligopeptide, peptide, polypeptide, or protein sequence, or a fragmentof any of these, and to naturally occurring or synthetic molecules.Where “amino acid sequence” is recited to refer to a sequence of anaturally occurring protein molecule, “amino acid sequence” and liketerms are not meant to limit the amino acid sequence to the completenative amino acid sequence associated with the recited protein molecule.

[0061] “Amplification” relates to the production of additional copies of a nucleic acid sequence. Amplification is generally carried out using polymerase chain reaction (PCR) technologies well known in the art.

[0062] The term “antagonist” refers to a molecule which inhibits orattenuates the biological activity of PRTS. Antagonists may includeproteins such as antibodies, nucleic acids, carbohydrates, smallmolecules, or any other compound or composition which modulates theactivity of PRTS either by directly interacting with PRTS or by actingon components of the biological pathway in which PRTS participates.

[0063] The term “antibody” refers to intact immunoglobulin molecules as well as to fragments thereof, such as Fab, F(ab′)₂, and Fv fragments, which are capable of binding an epitopic determinant. Antibodies that bind PRTS polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

[0064] The term “antigenic determinant” refers to that region of amolecule (i.e., an epitope) that makes contact with a particularantibody. When a protein or a fragment of a protein is used to immunizea host animal, numerous regions of the protein may induce the productionof antibodies which bind specifically to antigenic determinants(particular regions or three dimensional structures on the protein). Anantigenic determinant may compete with the intact antigen (i.e., theimmunogen used to elicit the immune response) for binding to anantibody.

[0065] The term “antisense” refers to any composition capable of base-pairing with the “sense” (coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified sugar groups such as 2′-methoxyethyl sugars or 2′-methoxyethoxy sugars; or oligonucleotides having modified bases such as 5-methyl cytosine, 2′-deoxyuracil, or 7-deaza-2′-deoxyguanosine. Antisense molecules may be produced by any method including chemical synthesis or transcription. Once introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring nucleic acid sequence produced by the cell to form duplexes which block either transcription or translation. The designation “negative” or “minus” can refer to the antisense strand, and the designation “positive” or “plus” can refer to the sense strand of a reference DNA molecule.

[0066] The term “biologically active” refers to a protein having structural, regulatory, or biochemical functions of a naturally occurring molecule. Likewise, “immunologically active” or “immunogenic” refers to the capability of the natural, recombinant, or synthetic PRTS, or of any oligopeptide thereof, to induce a specific immune response in appropriate animals or cells and to bind with specific antibodies.

[0067] “Complementary” describes the relationship between two single-stranded nucleic acid sequences that anneal by base-pairing. For example, 5′-AGT-3′ pairs with its complement, 3′-TCA-5′.
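As an illustration only, the base-pairing relationship in the example above can be sketched in a few lines of Python. The function names `complement` and `reverse_complement` are hypothetical and are not part of the specification; the sketch assumes uppercase or lowercase DNA containing only the unambiguous bases A, C, G, and T.

```python
# Watson-Crick pairing for unambiguous DNA bases (an assumption of this sketch).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, read in the same left-to-right order.

    For a strand written 5'->3', the result is therefore written 3'->5'.
    """
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand rewritten in the 5'->3' direction."""
    return complement(seq)[::-1]


print(complement("AGT"))          # TCA, i.e. 3'-TCA-5'
print(reverse_complement("AGT"))  # ACT, the same strand written 5'-ACT-3'
```

The two functions differ only in reading direction: `complement` reproduces the 3′-TCA-5′ notation used in the text, while `reverse_complement` gives the conventional 5′ to 3′ form.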

[0068] A “composition comprising a given polynucleotide sequence” and a“composition comprising a given amino acid sequence” refer broadly toany composition containing the given polynucleotide or amino acidsequence. The composition may comprise a dry formulation or an aqueoussolution. Compositions comprising polynucleotide sequences encoding PRTSor fragments of PRTS may be employed as hybridization probes. The probesmay be stored in freeze-dried form and may be associated with astabilizing agent such as a carbohydrate. In hybridizations, the probemay be deployed in an aqueous solution containing salts (e.g., NaCl),detergents (e.g., sodium dodecyl sulfate; SDS), and other components(e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

[0069] “Consensus sequence” refers to a nucleic acid sequence which has been subjected to repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied Biosystems, Foster City, Calif.) in the 5′ and/or the 3′ direction, and resequenced, or which has been assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison Wis.) or Phrap (University of Washington, Seattle Wash.). Some sequences have been both extended and assembled to produce the consensus sequence.

[0070] “Conservative amino acid substitutions” are those substitutions that are predicted to least interfere with the properties of the original protein, i.e., the structure and especially the function of the protein is conserved and not significantly changed by such substitutions. The table below shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative amino acid substitutions.

Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr

[0071] Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of the side chain.
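For illustration, the conservative substitution table in paragraph [0070] can be held as a lookup structure. The following Python sketch transcribes the table as given; the names `CONSERVATIVE` and `is_conservative` are hypothetical. Note that the table as written is directional: it lists Ser as a substitute for Ala but does not list Ala as a substitute for Ser, and it has no entry for Pro.

```python
# Conservative substitutions transcribed from the table in paragraph [0070].
# Keys are original residues; values are the substitutes the table lists.
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the table lists `replacement` as a substitute for `original`."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ala", "Ser"))  # True
print(is_conservative("Ser", "Ala"))  # False: the table is not symmetric
```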

[0072] A “deletion” refers to a change in the amino acid or nucleotide sequence that results in the absence of one or more amino acid residues or nucleotides.

[0073] The term “derivative” refers to a chemically modifiedpolynucleotide or polypeptide. Chemical modifications of apolynucleotide can include, for example, replacement of hydrogen by analkyl, acyl, hydroxyl, or amino group. A derivative polynucleotideencodes a polypeptide which retains at least one biological orimmunological function of the natural molecule. A derivative polypeptideis one modified by glycosylation, pegylation, or any similar processthat retains at least one biological or immunological function of thepolypeptide from which it was derived.

[0074] A “detectable label” refers to a reporter molecule or enzyme that is capable of generating a measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

[0075] “Differential expression” refers to increased or upregulated; ordecreased, downregulated, or absent gene or protein expression,determined by comparing at least two different samples. Such comparisonsmay be carried out between, for example, a treated and an untreatedsample, or a diseased and a normal sample.

[0076] “Exon shuffling” refers to the recombination of different codingregions (exons). Since an exon may represent a structural or functionaldomain of the encoded protein, new proteins may be assembled through thenovel reassortment of stable substructures, thus allowing accelerationof the evolution of new protein functions.

[0077] A “fragment” is a unique portion of PRTS or the polynucleotide encoding PRTS which is identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid residues in length. Fragments may be preferentially selected from certain regions of a molecule. For example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, including the Sequence Listing, tables, and figures, may be encompassed by the present embodiments.

[0078] A fragment of SEQ ID NO: 18-34 comprises a region of unique polynucleotide sequence that specifically identifies SEQ ID NO: 18-34, for example, as distinct from any other sequence in the genome from which the fragment was obtained. A fragment of SEQ ID NO: 18-34 is useful, for example, in hybridization and amplification technologies and in analogous methods that distinguish SEQ ID NO: 18-34 from related polynucleotide sequences. The precise length of a fragment of SEQ ID NO: 18-34 and the region of SEQ ID NO: 18-34 to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

[0079] A fragment of SEQ ID NO: 1-17 is encoded by a fragment of SEQ IDNO: 18-34. A fragment of SEQ ID NO: 1-17 comprises a region of uniqueamino acid sequence that specifically identifies SEQ ID NO: 1-17. Forexample, a fragment of SEQ ID NO: 1-17 is useful as an immunogenicpeptide for the development of antibodies that specifically recognizeSEQ ID NO: 1-17. The precise length of a fragment of SEQ ID NO: 1-17 andthe region of SEQ ID NO: 1-17 to which the fragment corresponds areroutinely determinable by one of ordinary skill in the art based on theintended purpose for the fragment.

[0080] A “full length” polynucleotide sequence is one containing at least a translation initiation codon (e.g., methionine) followed by an open reading frame and a translation termination codon. A “full length” polynucleotide sequence encodes a “full length” polypeptide sequence.
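One way to picture the “full length” criterion is as a search for an initiation codon followed in frame by a termination codon. The Python sketch below is illustrative only. It assumes the standard genetic code (ATG as the initiation codon; TAA, TAG, and TGA as termination codons), scans the forward strand only, and uses a hypothetical function name, `find_full_length_orf`.

```python
# Termination codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_full_length_orf(seq: str, start_codon: str = "ATG"):
    """Return (start, end) of the first start codon that is followed in frame
    by a stop codon, or None if no such open reading frame exists.

    `end` is exclusive and includes the stop codon, so seq[start:end] is the ORF.
    """
    seq = seq.upper()
    start = seq.find(start_codon)
    while start != -1:
        # Walk codon by codon in the reading frame set by this start codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        start = seq.find(start_codon, start + 1)
    return None


print(find_full_length_orf("CCATGGCTTAAGG"))  # (2, 11), i.e. ATGGCTTAA
print(find_full_length_orf("ATGGCT"))         # None: no termination codon
```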

[0081] “Homology” refers to sequence similarity or, interchangeably, sequence identity, between two or more polynucleotide sequences or two or more polypeptide sequences.

[0082] The terms “percent identity” and “% identity,” as applied to polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences.
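By way of a minimal sketch, percent identity can be computed from two sequences that have already been aligned, with gaps written as “-”. The function name `percent_identity` is hypothetical, and the choice of denominator (every aligned column, including gapped columns) is an assumption made for this sketch; alignment programs may count the denominator differently, so values need not match their reported figures.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Both inputs must already be aligned and therefore of equal length.
    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Six aligned columns, four of which match (A, C, T, and the final A).
print(f"{percent_identity('ACGT-A', 'ACCTGA'):.1f}")  # 66.7
```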

[0083] Percent identity between polynucleotide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program. This program is part of the LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR, Madison Wis.). CLUSTAL V is described in Higgins, D. G. and P. M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D. G. et al. (1992) CABIOS 8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5, window=4, and “diagonals saved”=4. The “weighted” residue weight table is selected as the default. Percent identity is reported by CLUSTAL V as the “percent similarity” between aligned polynucleotide sequences.

[0084] Alternatively, a suite of commonly used and freely available sequence comparison algorithms is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S. F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several sources, including the NCBI, Bethesda, Md., and on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis programs including “blastn,” that is used to align a known polynucleotide sequence with other polynucleotide sequences from a variety of databases. Also available is a tool called “BLAST 2 Sequences” that is used for direct pairwise comparison of two nucleotide sequences. “BLAST 2 Sequences” can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The “BLAST 2 Sequences” tool can be used for both blastn and blastp (discussed below). BLAST programs are commonly used with gap and other parameters set to default settings. For example, to compare two nucleotide sequences, one may use blastn with the “BLAST 2 Sequences” tool Version 2.0.12 (Apr. 21, 2000) set at default parameters. Such default parameters may be, for example:

[0085] Matrix: BLOSUM62

[0086] Reward for match: 1

[0087] Penalty for mismatch: −2

[0088] Open Gap: 5 and Extension Gap: 2 penalties

[0089] Gap x drop-off: 50

[0090] Expect: 10

[0091] Word Size: 11

[0092] Filter: on

[0093] Percent identity may be measured over the length of an entiredefined sequence, for example, as defined by a particular SEQ ID number,or may be measured over a shorter length, for example, over the lengthof a fragment taken from a larger, defined sequence, for instance, afragment of at least 20, at least 30, at least 40, at least 50, at least70, at least 100, or at least 200 contiguous nucleotides. Such lengthsare exemplary only, and it is understood that any fragment lengthsupported by the sequences shown herein, in the tables, figures, orSequence Listing, may be used to describe a length over which percentageidentity may be measured.

[0094] Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
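The effect of degeneracy can be shown with a short Python sketch, offered as an illustration only. It assumes the standard genetic code and builds the codon table from the conventional T, C, A, G ordering. The names `CODON_TABLE` and `translate`, and the two example sequences, are hypothetical and do not correspond to any sequence of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a termination codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1 to one-letter amino acids, ignoring any
    trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two sequences that differ at three third-codon positions.
seq_a = "ATGGCTAAAGAA"
seq_b = "ATGGCGAAGGAG"

print(translate(seq_a))  # MAKE
print(translate(seq_b))  # MAKE
```

Here the two polynucleotides match at only 9 of 12 positions (75%), yet they encode the same four residues.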

[0095] The phrases “percent identity” and “% identity,” as applied topolypeptide sequences, refer to the percentage of residue matchesbetween at least two polypeptide sequences aligned using a standardizedalgorithm. Methods of polypeptide sequence alignment are well-known.Some alignment methods take into account conservative amino acidsubstitutions. Such conservative substitutions, explained in more detailabove, generally preserve the charge and hydrophobicity at the site ofsubstitution, thus preserving the structure (and therefore function) ofthe polypeptide.

[0096] Percent identity between polypeptide sequences may be determinedusing the default parameters of the CLUSTAL V algorithm as incorporatedinto the MEGALIGN version 3.12e sequence alignment program (describedand referenced above). For pairwise alignments of polypeptide sequencesusing CLUSTAL V, the default parameters are set as follows: Ktuple=1,gap penalty=3, window=5, and “diagonals saved”=5. The PAM250 matrix isselected as the default residue weight table. As with polynucleotidealignments, the percent identity is reported by CLUSTAL V as the“percent similarity” between aligned polypeptide sequence pairs.

[0097] Alternatively the NCBI BLAST software suite may be used. Forexample, for a pairwise comparison of two polypeptide sequences, one mayuse the “BLAST 2 Sequences” tool Version 2.0.12 (Apr. 21, 2000) withblastp set at default parameters. Such default parameters may be, forexample:

[0098] Matrix: BLOSUM62

[0099] Open Gap: 11 and Extension Gap: 1 penalties

[0100] Gap x drop-off: 50

[0101] Expect: 10

[0102] Word Size: 3

[0103] Filter: on

[0104] Percent identity may be measured over the length of an entiredefined polypeptide sequence, for example, as defined by a particularSEQ ID number, or may be measured over a shorter length, for example,over the length of a fragment taken from a larger, defined polypeptidesequence, for instance, a fragment of at least 15, at least 20, at least30, at least 40, at least 50, at least 70 or at least 150 contiguousresidues. Such lengths are exemplary only, and it is understood that anyfragment length supported by the sequences shown herein, in the tables,figures or Sequence Listing, may be used to describe a length over whichpercentage identity may be measured.

[0105] “Human artificial chromosomes” (HACs) are linear microchromosomes which may contain DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for chromosome replication, segregation and maintenance.

[0106] The term “humanized antibody” refers to an antibody molecule in which the amino acid sequence in the non-antigen binding regions has been altered so that the antibody more closely resembles a human antibody, and still retains its original binding ability.

[0107] “Hybridization” refers to the process by which a polynucleotidestrand anneals with a complementary strand through base pairing underdefined hybridization conditions. Specific hybridization is anindication that two nucleic acid sequences share a high degree ofcomplementarity. Specific hybridization complexes form under permissiveannealing conditions and remain hybridized after the “washing” step(s).The washing step(s) is particularly important in determining thestringency of the hybridization process, with more stringent conditionsallowing less non-specific binding, i.e., binding between pairs ofnucleic acid strands that are not perfectly matched. Permissiveconditions for annealing of nucleic acid sequences are routinelydeterminable by one of ordinary skill in the art and may be consistentamong hybridization experiments, whereas wash conditions may be variedamong experiments to achieve the desired stringency, and thereforehybridization specificity. Permissive annealing conditions occur, forexample, at 68° C. in the presence of about 6×SSC, about 1% (w/v) SDS,and about 100 μg/ml sheared, denatured salmon sperm DNA.

[0108] Generally, stringency of hybridization is expressed, in part, with reference to the temperature under which the wash step is carried out. Such wash temperatures are typically selected to be about 5° C. to 20° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. The T_m is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. An equation for calculating T_m and conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.; specifically see volume 2, chapter 9.
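The relationship between melting point and wash temperature can be sketched as follows. The equation in the cited reference is not reproduced in this specification, so the sketch assumes a commonly quoted empirical approximation for long DNA duplexes, T_m = 81.5 + 16.6 log10[Na+] + 0.41 (%G+C) - 600/N, where N is the duplex length in nucleotides; the constant in the length term varies between references. The function names are hypothetical.

```python
import math


def estimate_tm_celsius(sequence: str, sodium_molar: float) -> float:
    """Estimate T_m (deg C) of a long DNA duplex.

    Assumed empirical approximation (not taken from this specification):
    T_m = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/N
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0 or sodium_molar <= 0:
        raise ValueError("need a non-empty sequence and a positive [Na+]")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 600.0 / n


def wash_temperature_range(tm_celsius: float) -> tuple:
    """Return (low, high) wash temperatures, 20 deg C to 5 deg C below T_m."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)
```

Lowering the sodium concentration lowers the estimated T_m, which is consistent with lower-salt washes being more stringent at a fixed wash temperature.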

[0109] High stringency conditions for hybridization betweenpolynucleotides of the present invention include wash conditions of 68°C. in the presence of about 0.2×SSC and about 0.1% SDS, for 1 hour.Alternatively, temperatures of about 65° C., 60° C., 55° C., or 42° C.may be used. SSC concentration may be varied from about 0.1 to 2×SSC,with SDS being present at about 0.1%. Typically, blocking reagents areused to block non-specific hybridization. Such blocking reagentsinclude, for instance, sheared and denatured salmon sperm DNA at about100-200 μg/ml. Organic solvent, such as formamide at a concentration ofabout 35-50% v/v, may also be used under particular circumstances, suchas for RNA:DNA hybridizations. Useful variations on these washconditions will be readily apparent to those of ordinary skill in theart. Hybridization, particularly under high stringency conditions, maybe suggestive of evolutionary similarity between the nucleotides. Suchsimilarity is strongly indicative of a similar role for the nucleotidesand their encoded polypeptides.

[0110] The term “hybridization complex” refers to a complex formedbetween two nucleic acid sequences by virtue of the formation ofhydrogen bonds between complementary bases. A hybridization complex maybe formed in solution (e.g., C₀t or R₀t analysis) or formed between onenucleic acid sequence present in solution and another nucleic acidsequence immobilized on a solid support (e.g., paper, membranes,filters, chips, pins or glass slides, or any other appropriate substrateto which cells or their nucleic acids have been fixed).

[0111] The words “insertion” and “addition” refer to changes in an aminoacid or nucleotide sequence resulting in the addition of one or moreamino acid residues or nucleotides, respectively.

[0112] “Immune response” can refer to conditions associated withinflammation, trauma, immune disorders, or infectious or geneticdisease, etc. These conditions can be characterized by expression ofvarious factors, e.g., cytokines, chemokines, and other signalingmolecules, which may affect cellular and systemic defense systems.

[0113] An “immunogenic fragment” is a polypeptide or oligopeptidefragment of PRTS which is capable of eliciting an immune response whenintroduced into a living organism, for example, a mammal. The term“immunogenic fragment” also includes any polypeptide or oligopeptidefragment of PRTS which is useful in any of the antibody productionmethods disclosed herein or known in the art.

[0114] The term “microarray” refers to an arrangement of a plurality ofpolynucleotides, polypeptides, or other chemical compounds on asubstrate.

[0115] The terms “element” and “array element” refer to apolynucleotide, polypeptide, or other chemical compound having a uniqueand defined position on a microarray.

[0116] The term “modulate” refers to a change in the activity of PRTS.For example, modulation may cause an increase or a decrease in proteinactivity, binding characteristics, or any other biological, functional,or immunological properties of PRTS.

[0117] The phrases “nucleic acid” and “nucleic acid sequence” refer to anucleotide, oligonucleotide, polynucleotide, or any fragment thereof.These phrases also refer to DNA or RNA of genomic or synthetic originwhich may be single-stranded or double-stranded and may represent thesense or the antisense strand, to peptide nucleic acid (PNA), or to anyDNA-like or RNA-like material.

[0118] “Operably linked” refers to the situation in which a firstnucleic acid sequence is placed in a functional relationship with asecond nucleic acid sequence. For instance, a promoter is operablylinked to a coding sequence if the promoter affects the transcription orexpression of the coding sequence. Operably linked DNA sequences may bein close proximity or contiguous and, where necessary to join twoprotein coding regions, in the same reading frame.

[0119] “Peptide nucleic acid” (PNA) refers to an antisense molecule oranti-gene agent which comprises an oligonucleotide of at least about 5nucleotides in length linked to a peptide backbone of amino acidresidues ending in lysine. The terminal lysine confers solubility to thecomposition. PNAs preferentially bind complementary single stranded DNAor RNA and stop transcript elongation, and may be pegylated to extendtheir lifespan in the cell.

[0120] “Post-translational modification” of a PRTS may involve lipidation, glycosylation, phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by cell type depending on the enzymatic milieu of PRTS.

[0121] “Probe” refers to nucleic acid sequences encoding PRTS, their complements, or fragments thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. “Primers” are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR).

[0122] Probes and primers as used in the present invention typicallycomprise at least 15 contiguous nucleotides of a known sequence. Inorder to enhance specificity, longer probes and primers may also beemployed, such as probes and primers that comprise at least 20, 25, 30,40, 50, 60, 70, 80, 90, 100, or at least 150 consecutive nucleotides ofthe disclosed nucleic acid sequences. Probes and primers may beconsiderably longer than these examples, and it is understood that anylength supported by the specification, including the tables, figures,and Sequence Listing, may be used.
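For illustration, the following Python sketch enumerates candidate fragments of a chosen number of contiguous nucleotides from a known sequence and computes the reverse complement to which such a fragment would anneal. It is not any of the primer selection programs named in this specification; the function names are hypothetical, and no melting temperature, specificity, or secondary structure screening is performed.

```python
# Translation table for complementary base pairing (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def contiguous_fragments(seq: str, length: int = 15):
    """Yield (start, fragment) for each window of `length` contiguous nucleotides."""
    if length <= 0:
        raise ValueError("length must be positive")
    for start in range(len(seq) - length + 1):
        yield start, seq[start:start + length]
```

A sequence of 17 nucleotides, for example, yields three candidate fragments of 15 contiguous nucleotides.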

[0123] Methods for preparing and using probes and primers are described in the references, for example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.; Ausubel, F. M. et al. (1987) Current Protocols in Molecular Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York N.Y.; Innis, M. et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, San Diego Calif. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge Mass.).

[0124] Oligonucleotides for use as primers are selected using software known in the art for such purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection programs have incorporated additional features for expanded capabilities. For example, the PrimOU primer selection program (available to the public from the Genome Center at University of Texas South West Medical Center, Dallas Tex.) is capable of choosing specific primers from megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection program (available to the public from the Whitehead Institute/MIT Center for Genome Research, Cambridge Mass.) allows the user to input a “mispriming library,” in which sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for microarrays. (The source code for the latter two primer selection programs may also be obtained from their respective sources and modified to meet the user's specific needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the above selection methods are useful in hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to those described above.

[0125] A “recombinant nucleic acid” is a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.

[0126] Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is expressed, inducing a protective immunological response in the mammal.

[0127] A “regulatory element” refers to a nucleic acid sequence usuallyderived from untranslated regions of a gene and includes enhancers,promoters, introns, and 5′ and 3′ untranslated regions (UTRs).Regulatory elements interact with host or viral proteins which controltranscription, translation, or RNA stability.

[0128] “Reporter molecules” are chemical or biochemical moieties usedfor labeling a nucleic acid, amino acid, or antibody. Reporter moleculesinclude radionuclides; enzymes; fluorescent, chemiluminescent, orchromogenic agents; substrates; cofactors; inhibitors; magneticparticles; and other moieties known in the art.

[0129] An “RNA equivalent,” in reference to a DNA sequence, is composedof the same linear sequence of nucleotides as the reference DNA sequencewith the exception that all occurrences of the nitrogenous base thymineare replaced with uracil, and the sugar backbone is composed of riboseinstead of deoxyribose.
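At the level of the written sequence, the conversion to an RNA equivalent can be sketched in Python as below. The function name is hypothetical, and the change from deoxyribose to ribose in the sugar backbone is not something a character string can represent.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence: every thymine (T) becomes uracil (U)."""
    return dna.upper().replace("T", "U")
```

The linear order of nucleotides is otherwise unchanged, so `rna_equivalent("ATGCTT")` gives `"AUGCUU"`.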

[0130] The term “sample” is used in its broadest sense. A samplesuspected of containing PRTS, nucleic acids encoding PRTS, or fragmentsthereof may comprise a bodily fluid; an extract from a cell, chromosome,organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA,or cDNA, in solution or bound to a substrate; a tissue; a tissue print;etc.

[0131] The terms “specific binding” and “specifically binding” refer to that interaction between a protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or synthetic binding composition. The interaction is dependent upon the presence of a particular structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For example, if an antibody is specific for epitope “A,” the presence of a polypeptide comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will reduce the amount of labeled A that binds to the antibody.

[0132] The term “substantially purified” refers to nucleic acid or aminoacid sequences that are removed from their natural environment and areisolated or separated, and are at least 60% free, preferably at least75% free, and most preferably at least 90% free from other componentswith which they are naturally associated.

[0133] A “substitution” refers to the replacement of one or more aminoacid residues or nucleotides by different amino acid residues ornucleotides, respectively.

[0134] “Substrate” refers to any suitable rigid or semi-rigid supportincluding membranes, filters, chips, slides, wafers, fibers, magnetic ornonmagnetic beads, gels, tubing, plates, polymers, microparticles andcapillaries. The substrate can have a variety of surface forms, such aswells, trenches, pins, channels and pores, to which polynucleotides orpolypeptides are bound.

[0135] A “transcript image” refers to the collective pattern of geneexpression by a particular cell type or tissue under given conditions ata given time.

[0136] “Transformation” describes a process by which exogenous DNA isintroduced into a recipient cell. Transformation may occur under naturalor artificial conditions according to various methods well known in theart, and may rely on any known method for the insertion of foreignnucleic acid sequences into a prokaryotic or eukaryotic host cell. Themethod for transformation is selected based on the type of host cellbeing transformed and may include, but is not limited to, bacteriophageor viral infection, electroporation, heat shock, lipofection, andparticle bombardment. The term “transformed cells” includes stablytransformed cells in which the inserted DNA is capable of replicationeither as an autonomously replicating plasmid or as part of the hostchromosome, as well as transiently transformed cells which express theinserted DNA or RNA for limited periods of time.

[0137] A “transgenic organism,” as used herein, is any organism,including but not limited to animals and plants, in which one or more ofthe cells of the organism contains heterologous nucleic acid introducedby way of human intervention, such as by transgenic techniques wellknown in the art. The nucleic acid is introduced into the cell, directlyor indirectly by introduction into a precursor of the cell, by way ofdeliberate genetic manipulation, such as by microinjection or byinfection with a recombinant virus. The term genetic manipulation doesnot include classical cross-breeding, or in vitro fertilization, butrather is directed to the introduction of a recombinant DNA molecule.The transgenic organisms contemplated in accordance with the presentinvention include bacteria, cyanobacteria, fungi, plants and animals.The isolated DNA of the present invention can be introduced into thehost by methods known in the art, for example infection, transfection,transformation or transconjugation. Techniques for transferring the DNAof the present invention into such organisms are widely known andprovided in references such as Sambrook et al. (1989), supra.

[0138] A “variant” of a particular nucleic acid sequence is defined as a nucleic acid sequence having at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the “BLAST 2 Sequences” tool Version 2.0.9 (May 7, 1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length. A variant may be described as, for example, an “allelic” (as defined above), “splice,” “species,” or “polymorphic” variant. A splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding polypeptide may possess additional functional domains or lack domains that are present in the reference molecule. Species variants are polynucleotide sequences that vary from one species to another. The resulting polypeptides will generally have significant amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants also may encompass “single nucleotide polymorphisms” (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state.

[0139] A “variant” of a particular polypeptide sequence is defined as a polypeptide sequence having at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the polypeptide sequences using blastp with the “BLAST 2 Sequences” tool Version 2.0.9 (May 7, 1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length of one of the polypeptides.

The Invention

[0140] The invention is based on the discovery of new human proteases (PRTS), the polynucleotides encoding PRTS, and the use of these compositions for the diagnosis, treatment, or prevention of gastrointestinal, cardiovascular, autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, and reproductive disorders.

[0141] Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide sequences of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown.

[0142] Table 2 shows sequences with homology to the polypeptides of the invention as identified by BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the invention. Column 3 shows the GenBank identification number (GenBank ID NO:) of the nearest GenBank homolog. Column 4 shows the probability score for the match between each polypeptide and its GenBank homolog. Column 5 shows the annotation of the GenBank homolog along with relevant citations where applicable, all of which are expressly incorporated by reference herein.

[0143] Table 3 shows various structural features of the polypeptides ofthe invention. Columns 1 and 2 show the polypeptide sequenceidentification number (SEQ ID NO:) and the corresponding Incytepolypeptide sequence number (Incyte Polypeptide ID) for each polypeptideof the invention. Column 3 shows the number of amino acid residues ineach polypeptide. Column 4 shows potential phosphorylation sites, andcolumn 5 shows potential glycosylation sites, as determined by theMOTIFS program of the GCG sequence analysis software package (GeneticsComputer Group, Madison Wis.). Column 6 shows amino acid residuescomprising signature sequences, domains, and motifs. Column 7 showsanalytical methods for protein structure/function analysis and in somecases, searchable databases to which the analytical methods wereapplied.

[0144] Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these properties establish that the claimed polypeptides are proteases. For example, SEQ ID NO: 1 is 89% identical to a human preprocathepsin L precursor (GenBank ID g190418) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 4.5e-169, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO: 1 also contains a papain family cysteine protease active site domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) The presence of this motif is confirmed by BLIMPS, MOTIFS, and PROFILESCAN analyses, providing further corroborative evidence that SEQ ID NO: 1 is a cysteine protease of the papain family. In an alternative example, SEQ ID NO: 6 has 44% local identity to Xenopus ovochymase, a polyprotease of the trypsin family (GenBank ID g2981641), as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 6.4e-201, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO: 6 contains a number of protease active site domains as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) The presence of these motifs is confirmed by BLIMPS, MOTIFS, and PROFILESCAN analyses. These analyses also reveal the presence of kringle and CUB domains, as well as a signal peptide. Together, these data provide further corroborative evidence that SEQ ID NO: 6 is a serine protease of the trypsin family. In an alternative example, SEQ ID NO: 10 is 50% identical to a human ubiquitin-specific processing protease (GenBank ID g6941888) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 7.5e-273, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO: 10 is also 51% identical to a murine ubiquitin-specific processing protease (GenBank ID g6941890) as determined by the BLAST analysis with a probability score of 4.0e-271. SEQ ID NO: 10 also contains ubiquitin carboxyl-terminal hydrolase (i.e., ubiquitin-specific protease) domains as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS and MOTIFS analyses provide further corroborative evidence that SEQ ID NO: 10 is a ubiquitin-specific protease. In an alternative example, SEQ ID NO: 16 has 52% local identity to Xenopus ADAM13 (GenBank ID g1916617) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 1.4e-198, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO: 16 contains a reprolysin family neutral zinc protease active site domain, a reprolysin family propeptide, and a disintegrin domain signature as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) The presence of these domains is confirmed by BLIMPS, MOTIFS, and PROFILESCAN analyses, providing further corroborative evidence that SEQ ID NO: 16 is a metalloprotease of the ADAM family. In an alternative example, SEQ ID NO: 17 is 30% identical to the human zinc metalloprotease ADAMTS6 (GenBank ID g5923786) as determined by CLUSTAL V analysis, and has 44% local identity to it, as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 9.1e-164, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO: 17 also contains a zinc metalloprotease active site domain, a reprolysin family metalloprotease propeptide, and a type I thrombospondin domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS analysis provide further corroborative evidence that SEQ ID NO: 17 is a metalloprotease of the ADAMTS family. SEQ ID NO: 2-5, SEQ ID NO: 7-9, and SEQ ID NO: 11-15 were analyzed and annotated in a similar manner. The algorithms and parameters for the analysis of SEQ ID NO: 1-17 are described in Table 7.

[0145] As shown in Table 4, the full length polynucleotide sequences of the present invention were assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any combination of these two types of sequences. Columns 1 and 2 list the polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and the corresponding Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) for each polynucleotide of the invention. Column 3 shows the length of each polynucleotide sequence in basepairs. Column 4 lists fragments of the polynucleotide sequences which are useful, for example, in hybridization or amplification technologies that identify SEQ ID NO: 18-34 or that distinguish between SEQ ID NO: 18-34 and related polynucleotide sequences. Column 5 shows identification numbers corresponding to cDNA sequences, coding sequences (exons) predicted from genomic DNA, and/or sequence assemblages comprised of both cDNA and genomic DNA. These sequences were used to assemble the full length polynucleotide sequences of the invention. Columns 6 and 7 of Table 4 show the nucleotide start (5′) and stop (3′) positions of the cDNA and/or genomic sequences in column 5 relative to their respective full length sequences.

[0146] The identification numbers in Column 5 of Table 4 may refer specifically, for example, to Incyte cDNAs along with their corresponding cDNA libraries. For example, 6917460H1 is the identification number of an Incyte cDNA sequence, and PLACFER06 is the cDNA library from which it is derived. Incyte cDNAs for which cDNA libraries are not indicated were derived from pooled cDNA libraries (e.g., 72004319V1). Alternatively, the identification numbers in column 5 may refer to GenBank cDNAs or ESTs (e.g., g1365166) which contributed to the assembly of the full length polynucleotide sequences. In addition, the identification numbers in column 5 may identify sequences derived from the ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the designation “ENST”). Alternatively, the identification numbers in column 5 may be derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences including the designation “NM” or “NT”) or the NCBI RefSeq Protein Sequence Records (i.e., those sequences including the designation “NP”). Alternatively, the identification numbers in column 5 may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an “exon stitching” algorithm. For example, FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a “stitched” sequence in which XXXXXX is the identification number of the cluster of sequences to which the algorithm was applied, YYYYY is the number of the prediction generated by the algorithm, and N1, N2, N3, and so on, if present, represent specific exons that may have been manually edited during analysis (See Example V). Alternatively, the identification numbers in column 5 may refer to assemblages of exons brought together by an “exon-stretching” algorithm. For example, FLXXXXXX_gAAAAA_gBBBBB_1_N is the identification number of a “stretched” sequence, XXXXXX being the Incyte project identification number, gAAAAA being the GenBank identification number of the human genomic sequence to which the “exon-stretching” algorithm was applied, gBBBBB being the GenBank identification number or NCBI RefSeq identification number of the nearest GenBank protein homolog, and N referring to specific exons (See Example V). In instances where a RefSeq sequence was used as a protein homolog for the “exon-stretching” algorithm, a RefSeq identifier (denoted by “NM,” “NP,” or “NT”) may be used in place of the GenBank identifier (i.e., gBBBBB).

[0147] Alternatively, a prefix identifies component sequences that were hand-edited, predicted from genomic DNA sequences, or derived from a combination of sequence analysis methods. The following Table lists examples of component sequence prefixes and corresponding sequence analysis methods associated with the prefixes (see Example IV and Example V).

Prefix            Type of analysis and/or examples of programs
GNN, GFG, ENST    Exon prediction from genomic sequences using, for example, GENSCAN (Stanford University, CA, USA) or FGENES (Computer Genomics Group, The Sanger Centre, Cambridge, UK).
GBI               Hand-edited analysis of genomic sequences.
FL                Stitched or stretched genomic sequences (see Example V).
INCY              Full length transcript and exon prediction from mapping of EST sequences to the genome. Genomic location and EST composition data are combined to predict the exons and resulting transcript.
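The prefix convention described above lends itself to a simple lookup, sketched here in Python. The dictionary restates the listed prefixes and analysis types; the function name and the example identifiers in the usage note are hypothetical and do not appear in Table 4.

```python
# Prefix -> type of analysis, restating the prefixes listed above.
PREFIX_METHODS = {
    "GNN": "exon prediction from genomic sequences",
    "GFG": "exon prediction from genomic sequences",
    "ENST": "exon prediction from genomic sequences",
    "GBI": "hand-edited analysis of genomic sequences",
    "FL": "stitched or stretched genomic sequences",
    "INCY": "full length transcript and exon prediction from EST mapping",
}


def classify_component(identifier: str) -> str:
    """Return the analysis type implied by an identifier's prefix.

    Identifiers without a listed prefix (for example, plain Incyte cDNA
    or GenBank numbers) are reported as having no recognized prefix.
    """
    # Try longer prefixes first so a short prefix never shadows a longer one.
    for prefix in sorted(PREFIX_METHODS, key=len, reverse=True):
        if identifier.startswith(prefix):
            return PREFIX_METHODS[prefix]
    return "no recognized prefix"
```

For instance, a made-up identifier such as "GBI0000001" would be reported as a hand-edited genomic sequence, whereas "6917460H1" carries no listed prefix.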

[0148] In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in column 5 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte cDNA identification numbers are not shown.

[0149] Table 5 shows the representative cDNA libraries for those full length polynucleotide sequences which were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte cDNA library which is most frequently represented by the Incyte cDNA sequences which were used to assemble and confirm the above polynucleotide sequences. The tissues and vectors which were used to construct the cDNA libraries shown in Table 5 are described in Table 6.

[0150] The invention also encompasses PRTS variants. A preferred PRTS variant is one which has at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid sequence identity to the PRTS amino acid sequence, and which contains at least one functional or structural characteristic of PRTS.

[0151] The invention also encompasses polynucleotides which encode PRTS. In a particular embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID NO: 18-34, which encodes PRTS. The polynucleotide sequences of SEQ ID NO: 18-34, as presented in the Sequence Listing, embrace the equivalent RNA sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
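The DNA-to-RNA equivalence stated above amounts, at the level of the written sequence, to a single base substitution. A minimal sketch follows; the function name dna_to_rna and the example sequence are hypothetical and are not taken from the Sequence Listing.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence equivalent to a DNA sequence.

    Only the written base symbols change (thymine to uracil); the change
    of sugar backbone has no representation in a sequence string.
    """
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    print(dna_to_rna("ATGGCTTAA"))
```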

[0152] The invention also encompasses a variant of a polynucleotide sequence encoding PRTS. In particular, such a variant polynucleotide sequence will have at least about 70%, or alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to the polynucleotide sequence encoding PRTS. A particular aspect of the invention encompasses a variant of a polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID NO: 18-34 which has at least about 70%, or alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 18-34. Any one of the polynucleotide variants described above can encode an amino acid sequence which contains at least one functional or structural characteristic of PRTS.
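To make the identity thresholds concrete, the following sketch computes percent identity for two sequences that have already been aligned. It is an illustration under stated assumptions, not the method of the specification: the alignment is assumed to come from a separate program, gaps are written as "-", identity is taken over all alignment columns, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are pre-aligned, equal-length strings that use "-"
    for gaps. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical ten-base sequences differing at one position.
    print(percent_identity("ATGGCTAAGC", "ATGGCTAAGG"))
```

Other conventions exist for the denominator (for example, the length of the shorter sequence), so a value computed this way may differ from one reported by a particular alignment program.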

[0153] It will be appreciated by those skilled in the art that as a result of the degeneracy of the genetic code, a multitude of polynucleotide sequences encoding PRTS, some bearing minimal similarity to the polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus, the invention contemplates each and every possible variation of polynucleotide sequence that could be made by selecting combinations based on possible codon choices. These combinations are made in accordance with the standard triplet genetic code as applied to the polynucleotide sequence of naturally occurring PRTS, and all such variations are to be considered as being specifically disclosed.
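The scale of the “multitude” referred to above can be estimated by multiplying, residue by residue, the number of codons that the standard genetic code assigns to each amino acid. The sketch below does this for a short peptide; the function name and the example peptides are hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons encoding each amino acid ("*" marks stop codons).
DEGENERACY = Counter(CODON_TABLE.values())


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the given peptide."""
    return prod(DEGENERACY[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # Methionine and tryptophan each have one codon; leucine and serine have six.
    print(count_coding_sequences("MW"), count_coding_sequences("MLS"))
```

Because the count is a product over residues, it grows exponentially with peptide length, which is why enumeration is impractical for a full length protein.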

[0154] Although nucleotide sequences which encode PRTS and its variants are generally capable of hybridizing to the nucleotide sequence of the naturally occurring PRTS under appropriately selected conditions of stringency, it may be advantageous to produce nucleotide sequences encoding PRTS or its derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which particular codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence encoding PRTS and its derivatives without altering the encoded amino acid sequences include the production of RNA transcripts having more desirable properties, such as a greater half-life, than transcripts produced from the naturally occurring sequence.
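Selecting codons according to host usage frequency can be sketched as follows. This is one simple strategy (always take the most frequent codon for each residue) and is offered only as an illustration; the usage frequencies in the example are invented, and the function names recode_for_host and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def recode_for_host(peptide: str, codon_usage: dict) -> str:
    """Encode a peptide using the host's most frequently used codon per residue.

    codon_usage maps codon -> relative frequency in the host; codons that
    are missing from the mapping are treated as frequency 0.0.
    """
    preferred = {}
    for codon, amino_acid in CODON_TABLE.items():
        frequency = codon_usage.get(codon, 0.0)
        if amino_acid not in preferred or frequency > preferred[amino_acid][1]:
            preferred[amino_acid] = (codon, frequency)
    return "".join(preferred[residue][0] for residue in peptide.upper())


def translate(dna: str) -> str:
    """Translate a coding sequence, used here to confirm the peptide is unchanged."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


if __name__ == "__main__":
    # Invented frequencies for a hypothetical host.
    usage = {"CTG": 0.5, "TTA": 0.1, "AAA": 0.7, "AAG": 0.3}
    recoded = recode_for_host("MLK", usage)
    print(recoded, translate(recoded))
```

Translating the recoded sequence back is a cheap check that only the nucleotide sequence, and not the encoded amino acid sequence, has been altered.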

[0155] The invention also encompasses production of DNA sequences which encode PRTS and PRTS derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic sequence may be inserted into any of the many available expression vectors and cell systems using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce mutations into a sequence encoding PRTS or any fragment thereof.

[0156] Also encompassed by the invention are polynucleotide sequences that are capable of hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID NO: 18-34 and fragments thereof under various conditions of stringency. (See, e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A. R. (1987) Methods Enzymol. 152:507-511.) Hybridization conditions, including annealing and wash conditions, are described in “Definitions.”

[0157] Methods for DNA sequencing are well known in the art and may be used to practice any of the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland Ohio), Taq polymerase (Applied Biosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech, Piscataway N.J.), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies, Gaithersburg Md.). Preferably, sequence preparation is automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno Nev.), PTC200 thermal cycler (MJ Research, Watertown Mass.) and ABI CATALYST 800 thermal cycler (Applied Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied Biosystems), the MEGABACE 1000 DNA sequencing system (Molecular Dynamics, Sunnyvale Calif.), or other systems known in the art. The resulting sequences are analyzed using a variety of algorithms which are well known in the art. (See, e.g., Ausubel, F. M. (1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., unit 7.7; Meyers, R. A. (1995) Molecular Biology and Biotechnology, Wiley VCH, New York N.Y., pp. 856-853.)

[0158] The nucleic acid sequences encoding PRTS may be extended utilizing a partial nucleotide sequence and employing various PCR-based methods known in the art to detect upstream sequences, such as promoters and regulatory elements. For example, one method which may be employed, restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic DNA within a cloning vector. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.) Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown sequence from a circularized template. The template is derived from restriction fragments comprising a known genomic locus and surrounding sequences. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186.) A third method, capture PCR, involves PCR amplification of DNA fragments adjacent to known sequences in human and yeast artificial chromosome DNA. (See, e.g., Lagerstrom, M. et al. (1991) PCR Methods Applic. 1:111-119.) In this method, multiple restriction enzyme digestions and ligations may be used to insert an engineered double-stranded sequence into a region of unknown sequence before performing PCR. Other methods which may be used to retrieve unknown sequences are known in the art. (See, e.g., Parker, J. D. et al. (1991) Nucleic Acids Res. 19:3055-3060.) Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo Alto Calif.) to walk genomic DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon junctions. For all PCR-based methods, primers may be designed using commercially available software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth Minn.) or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the template at temperatures of about 68° C. to 72° C.
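The primer design criteria at the end of the paragraph above can be expressed as a simple screening function. The sketch below is not the algorithm used by OLIGO or any other commercial program: it uses a common rule-of-thumb melting temperature approximation as a stand-in for the annealing temperature, takes the annealing window as 68° C. to 72° C., and all names and example primers are hypothetical.

```python
# Criteria taken from the paragraph above. The melting-temperature formula
# is a rule-of-thumb approximation and is an assumption of this sketch.
MIN_LENGTH, MAX_LENGTH = 22, 30
MIN_GC_FRACTION = 0.50
MIN_ANNEAL_C, MAX_ANNEAL_C = 68.0, 72.0


def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def approximate_tm(primer: str) -> float:
    """Approximate melting temperature in degrees Celsius.

    Uses Tm = 64.9 + 41 * (GC count - 16.4) / length, an approximation
    commonly quoted for primers longer than about 13 nucleotides.
    """
    primer = primer.upper()
    gc_count = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(primer)


def meets_criteria(primer: str) -> bool:
    """True if the primer satisfies the length, GC, and temperature criteria."""
    return (
        MIN_LENGTH <= len(primer) <= MAX_LENGTH
        and gc_fraction(primer) >= MIN_GC_FRACTION
        and MIN_ANNEAL_C <= approximate_tm(primer) <= MAX_ANNEAL_C
    )


if __name__ == "__main__":
    candidate = "GC" * 10 + "AT" * 5  # artificial 30-mer, for illustration only
    print(len(candidate), round(approximate_tm(candidate), 2), meets_criteria(candidate))
```

A nearest-neighbor thermodynamic model would give more reliable temperatures; the simple formula is used here only to keep the example self-contained.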

[0159] When screening for full length cDNAs, it is preferable to use libraries that have been size-selected to include larger cDNAs. In addition, random-primed libraries, which often include sequences containing the 5′ regions of genes, are preferable for situations in which an oligo d(T) library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into 5′ non-transcribed regulatory regions.

[0160] Capillary electrophoresis systems which are commercially available may be used to analyze the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary sequencing may employ flowable polymers for electrophoretic separation, four different nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire process from loading of samples to computer analysis and electronic data display may be computer controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments which may be present in limited amounts in a particular sample.

[0161] In another embodiment of the invention, polynucleotide sequences or fragments thereof which encode PRTS may be cloned in recombinant DNA molecules that direct expression of PRTS, or fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent amino acid sequence may be produced and used to express PRTS.

[0162] The nucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter PRTS-encoding sequences for a variety of purposes including, but not limited to, modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, and so forth.

[0163] The nucleotides of the present invention may be subjected to DNA shuffling techniques such as MOLECULARBREEDING (Maxygen Inc., Santa Clara Calif.; described in U.S. Pat. No. 5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F. C. et al. (1999) Nat. Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve the biological properties of PRTS, such as its biological or enzymatic activity or its ability to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene variants is produced using PCR-mediated recombination of gene fragments. The library is then subjected to selection or screening procedures that identify those gene variants with the desired properties. These preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and selection/screening. Thus, genetic diversity is created through “artificial” breeding and rapid molecular evolution. For example, fragments of a single gene containing random point mutations may be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively, fragments of a given gene may be recombined with fragments of homologous genes in the same gene family, either from the same or different species, thereby maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable manner.
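The recursive cycle of recombination followed by selection or screening can be mimicked in silico with a toy model. The sketch below is only an analogy to the laboratory procedure: it assumes equal-length parent sequences, recombines them at random crossover points, and scores variants with an arbitrary, made-up fitness function. All names and parameters are hypothetical.

```python
import random


def shuffle_once(parents, n_children, n_crossovers, rng):
    """Build a library of chimeras by joining segments drawn from random parents."""
    length = len(parents[0])
    library = []
    for _ in range(n_children):
        points = sorted(rng.sample(range(1, length), n_crossovers))
        bounds = [0] + points + [length]
        child = "".join(
            rng.choice(parents)[start:end]
            for start, end in zip(bounds, bounds[1:])
        )
        library.append(child)
    return library


def evolve(parents, fitness, rounds=3, library_size=50, keep=5, seed=0):
    """Alternate shuffling and screening, keeping the best variants each round."""
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(rounds):
        library = shuffle_once(pool, library_size, 2, rng)
        # Screening step: rank the library plus the current pool, keep the top few.
        pool = sorted(set(library + pool), key=fitness, reverse=True)[:keep]
    return pool


if __name__ == "__main__":
    # Toy fitness: reward "A" in the first half and "C" in the second half.
    def toy_fitness(seq):
        half = len(seq) // 2
        return seq[:half].count("A") + seq[half:].count("C")

    best = evolve(["AAAAAAAAAA", "CCCCCCCCCC"], toy_fitness)[0]
    print(best, toy_fitness(best))
```

Because the current pool is carried into each screening step, the best score can never fall from one round to the next.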

[0164] In another embodiment, sequences encoding PRTS may be synthesized, in whole or in part, using chemical methods well known in the art. (See, e.g., Caruthers, M. H. et al. (1980) Nucleic Acids Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.) Alternatively, PRTS itself or a fragment thereof may be synthesized using chemical methods. For example, peptide synthesis can be performed using various solution-phase or solid-phase techniques. (See, e.g., Creighton, T. (1984) Proteins, Structures and Molecular Properties, W H Freeman, New York N.Y., pp. 55-60; and Roberge, J. Y. et al. (1995) Science 269:202-204.) Automated synthesis may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino acid sequence of PRTS, or any part thereof, may be altered during direct synthesis and/or combined with sequences from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide having a sequence of a naturally occurring polypeptide.

[0165] The peptide may be substantially purified by preparative high performance liquid chromatography. (See, e.g., Chiez, R. M. and F. Z. Regnier (1990) Methods Enzymol. 182:392-421.) The composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing. (See, e.g., Creighton, supra, pp. 29-53.)

[0166] In order to express a biologically active PRTS, the nucleotide sequences encoding PRTS or derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for transcriptional and translational control of the inserted coding sequence in a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and inducible promoters, and 5′ and 3′ untranslated regions in the vector and in polynucleotide sequences encoding PRTS. Such elements may vary in their strength and specificity. Specific initiation signals may also be used to achieve more efficient translation of sequences encoding PRTS. Such signals include the ATG initiation codon and adjacent sequences, e.g., the Kozak sequence. In cases where sequences encoding PRTS and its initiation codon and upstream regulatory sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous translational control signals including an in-frame ATG initiation codon should be provided by the vector. Exogenous translational elements and initiation codons may be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers appropriate for the particular host cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl. Cell Differ. 20:125-162.)

[0167] Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding PRTS and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y., ch. 4, 8, and 16-17; Ausubel, F. M. et al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., ch. 9, 13, and 16.)

[0168] A variety of expression vector/host systems may be utilized to contain and express sequences encoding PRTS. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E. K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y., pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA 90(13):6340-6344; Buller, R. M. et al. (1985) Nature 317(6040):813-815; McGregor, D. P. et al. (1994) Mol. Immunol. 31(3):219-226; and Verma, J. M. and N. Somia (1997) Nature 389:239-242.) The invention is not limited by the host cell employed.

[0169] In bacterial systems, a number of cloning and expression vectors may be selected depending upon the use intended for polynucleotide sequences encoding PRTS. For example, routine cloning, subcloning, and propagation of polynucleotide sequences encoding PRTS can be achieved using a multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla Calif.) or PSPORT1 plasmid (Life Technologies). Ligation of sequences encoding PRTS into the vector's multiple cloning site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509.) When large quantities of PRTS are needed, e.g., for the production of antibodies, vectors which direct high level expression of PRTS may be used. For example, vectors containing the strong, inducible SP6 or T7 bacteriophage promoter may be used.

[0170] Yeast expression systems may be used for production of PRTS. A number of vectors containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such vectors direct either the secretion or intracellular retention of expressed proteins and enable integration of foreign sequences into the host genome for stable propagation. (See, e.g., Ausubel 1995, supra; Bitter, G. A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C. A. et al. (1994) Bio/Technology 12:181-184.)

[0171] Plant systems may also be used for expression of PRTS. Transcription of sequences encoding PRTS may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J. 6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters may be used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.) These constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection. (See, e.g., The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y., pp. 191-196.)

[0172] In mammalian cells, a number of viral-based expression systems may be utilized. In cases where an adenovirus is used as an expression vector, sequences encoding PRTS may be ligated into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain infective virus which expresses PRTS in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based vectors may also be used for high-level protein expression.

[0173] Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers, or vesicles) for therapeutic purposes. (See, e.g., Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355.)

[0174] For long term production of recombinant proteins in mammalian systems, stable expression of PRTS in cell lines is preferred. For example, sequences encoding PRTS can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before being switched to selective media. The purpose of the selectable marker is to confer resistance to a selective agent, and its presence allows growth and recovery of cells which successfully express the introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue culture techniques appropriate to the cell type.

[0175] Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the herpes simplex virus thymidine kinase and adenine phosphoribosyltransferase genes, for use in tk⁻ and apr⁻ cells, respectively. (See, e.g., Wigler, M. et al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic, or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to methotrexate; neo confers resistance to the aminoglycosides neomycin and G418; and als and pat confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively. (See, e.g., Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14.) Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular requirements for metabolites. (See, e.g., Hartman, S. C. and R. C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; Clontech), β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used. These markers can be used not only to identify transformants, but also to quantify the amount of transient or stable protein expression attributable to a specific vector system. (See, e.g., Rhodes, C. A. (1995) Methods Mol. Biol. 55:121-131.)

[0176] Although the presence/absence of marker gene expression suggests that the gene of interest is also present, the presence and expression of the gene may need to be confirmed. For example, if the sequence encoding PRTS is inserted within a marker gene sequence, transformed cells containing sequences encoding PRTS can be identified by the absence of marker gene function. Alternatively, a marker gene can be placed in tandem with a sequence encoding PRTS under the control of a single promoter. Expression of the marker gene in response to induction or selection usually indicates expression of the tandem gene as well.

[0177] In general, host cells that contain the nucleic acid sequence encoding PRTS and that express PRTS may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or protein sequences.

[0178] Immunological methods for detecting and measuring the expression of PRTS using either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes on PRTS is preferred, but a competitive binding assay may be employed. These and other assays are well known in the art. (See, e.g., Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul Minn., Sect. IV; Coligan, J. E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and Wiley-Interscience, New York N.Y.; and Pound, J. D. (1998) Immunochemical Protocols, Humana Press, Totowa N.J.)

[0179] A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting sequences related to polynucleotides encoding PRTS include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, the sequences encoding PRTS, or any fragments thereof, may be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety of commercially available kits, such as those provided by Amersham Pharmacia Biotech, Promega (Madison Wis.), and US Biochemical. Suitable reporter molecules or labels which may be used for ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

[0180] Host cells transformed with nucleotide sequences encoding PRTS may be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The protein produced by a transformed cell may be secreted or retained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode PRTS may be designed to contain signal sequences which direct secretion of PRTS through a prokaryotic or eukaryotic cell membrane.

[0181] In addition, a host cell strain may be chosen for its ability to modulate expression of the inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a “prepro” or “pro” form of the protein may also be used to specify protein targeting, folding, and/or activity. Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture Collection (ATCC, Manassas Va.) and may be chosen to ensure the correct modification and processing of the foreign protein.

[0182] In another embodiment of the invention, natural, modified, or recombinant nucleic acid sequences encoding PRTS may be ligated to a heterologous sequence resulting in translation of a fusion protein in any of the aforementioned host systems. For example, a chimeric PRTS protein containing a heterologous moiety that can be recognized by a commercially available antibody may facilitate the screening of peptide libraries for inhibitors of PRTS activity. Heterologous protein and peptide moieties may also facilitate purification of fusion proteins using commercially available affinity matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site located between the PRTS encoding sequence and the heterologous protein sequence, so that PRTS may be cleaved away from the heterologous moiety following purification. Methods for fusion protein expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety of commercially available kits may also be used to facilitate expression and purification of fusion proteins.

[0183] In a further embodiment of the invention, synthesis of radiolabeled PRTS may be achieved in vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for example, ³⁵S-methionine.

[0184] PRTS of the present invention or fragments thereof may be used to screen for compounds that specifically bind to PRTS. At least one and up to a plurality of test compounds may be screened for specific binding to PRTS. Examples of test compounds include antibodies, oligonucleotides, proteins (e.g., receptors), or small molecules.

[0185] In one embodiment, the compound thus identified is closely related to the natural ligand of PRTS, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a natural binding partner. (See, e.g., Coligan, J. E. et al. (1991) Current Protocols in Immunology 1(2): Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which PRTS binds, or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the compound can be rationally designed using known techniques. In one embodiment, screening for these compounds involves producing appropriate cells which express PRTS, either as a secreted protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing PRTS or cell membrane fractions which contain PRTS are then contacted with a test compound and binding, stimulation, or inhibition of activity of either PRTS or the compound is analyzed.

[0186] An assay may simply test binding of a test compound to the polypeptide, wherein binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, the assay may comprise the steps of combining at least one test compound with PRTS, either in solution or affixed to a solid support, and detecting the binding of PRTS to the compound. Alternatively, the assay may detect or measure binding of a test compound in the presence of a labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.

[0187] PRTS of the present invention or fragments thereof may be used to screen for compounds that modulate the activity of PRTS. Such compounds may include agonists, antagonists, or partial or inverse agonists. In one embodiment, an assay is performed under conditions permissive for PRTS activity, wherein PRTS is combined with at least one test compound, and the activity of PRTS in the presence of a test compound is compared with the activity of PRTS in the absence of the test compound. A change in the activity of PRTS in the presence of the test compound is indicative of a compound that modulates the activity of PRTS. Alternatively, a test compound is combined with an in vitro or cell-free system comprising PRTS under conditions suitable for PRTS activity, and the assay is performed. In either of these assays, a test compound which modulates the activity of PRTS may do so indirectly and need not come in direct contact with PRTS. At least one and up to a plurality of test compounds may be screened.
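The comparison of activity in the presence and absence of a test compound can be reduced to a ratio. The following sketch is illustrative only: the 1.5-fold cutoff is a hypothetical threshold chosen for the example, not a value from the specification, and the function name and activity units are assumptions.

```python
def classify_modulation(activity_without: float,
                        activity_with: float,
                        min_fold_change: float = 1.5) -> str:
    """Compare measured activity with and without a test compound.

    activity_without and activity_with are activity measurements in the
    same arbitrary units. min_fold_change is a hypothetical cutoff used to
    decide whether a difference counts as a change.
    """
    if activity_without <= 0 or activity_with < 0:
        raise ValueError("activities must be non-negative, baseline must be positive")
    ratio = activity_with / activity_without
    if ratio >= min_fold_change:
        return "increases activity"
    if ratio <= 1.0 / min_fold_change:
        return "decreases activity"
    return "no clear change"


if __name__ == "__main__":
    print(classify_modulation(100.0, 40.0))
```

In practice the cutoff would be set from replicate measurements and assay variability rather than fixed in advance.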

[0188] In another embodiment, polynucleotides encoding PRTS or their mammalian homologs may be “knocked out” in an animal model system using homologous recombination in embryonic stem (ES) cells. Such techniques are well known in the art and are useful for the generation of animal models of human disease. (See, e.g., U.S. Pat. No. 5,175,383 and U.S. Pat. No. 5,767,337.) For example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M. R. (1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host genome by homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP system to knock out a gene of interest in a tissue- or developmental stage-specific manner (Marth, J. D. (1996) J. Clin. Invest. 97:1999-2002; Wagner, K. U. et al. (1997) Nucleic Acids Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

[0189] Polynucleotides encoding PRTS may also be manipulated in vitro in ES cells derived from human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J. A. et al. (1998) Science 282:1145-1147).

[0190] Polynucleotides encoding PRTS can also be used to create “knockin” humanized animals (pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of a polynucleotide encoding PRTS is injected into animal ES cells, and the injected sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a mammal inbred to overexpress PRTS, e.g., by secreting PRTS in its milk, may also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

Therapeutics

[0191] Chemical and structural similarity, e.g., in the context of sequences and motifs, exists between regions of PRTS and proteases. In addition, the expression of PRTS is closely associated with digestive, lung, neurological, gastrointestinal, cardiovascular, urinary, reproductive, fibroblastic, developmental, and endothelial tissues, and also prostate cancer and other tumorous tissue. Therefore, PRTS appears to play a role in gastrointestinal, cardiovascular, autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, and reproductive disorders. In the treatment of disorders associated with increased PRTS expression or activity, it is desirable to decrease the expression or activity of PRTS. In the treatment of disorders associated with decreased PRTS expression or activity, it is desirable to increase the expression or activity of PRTS.

[0192] Therefore, in one embodiment, PRTS or a fragment or derivative thereof may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of PRTS. Examples of such disorders include, but are not limited to, a gastrointestinal disorder, such as dysphagia, peptic esophagitis, esophageal spasm, esophageal stricture, esophageal carcinoma, dyspepsia, indigestion, gastritis, gastric carcinoma, anorexia, nausea, emesis, gastroparesis, antral or pyloric edema, abdominal angina, pyrosis, gastroenteritis, intestinal obstruction, infections of the intestinal tract, peptic ulcer, cholelithiasis, cholecystitis, cholestasis, pancreatitis, pancreatic carcinoma, biliary tract disease, hepatitis, hyperbilirubinemia, cirrhosis, passive congestion of the liver, hepatoma, infectious colitis, ulcerative colitis, ulcerative proctitis, Crohn's disease, Whipple's disease, Mallory-Weiss syndrome, colonic carcinoma, colonic obstruction, irritable bowel syndrome, short bowel syndrome, diarrhea, constipation, gastrointestinal hemorrhage, acquired immunodeficiency syndrome (AIDS) enteropathy, jaundice, hepatic encephalopathy, hepatorenal syndrome, hepatic steatosis, hemochromatosis, Wilson's disease, alpha₁-antitrypsin deficiency, Reye's syndrome, primary sclerosing cholangitis, liver infarction, portal vein obstruction and thrombosis, centrilobular necrosis, peliosis hepatis, hepatic vein thrombosis, veno-occlusive disease, preeclampsia, eclampsia, acute fatty liver of pregnancy, intrahepatic cholestasis of pregnancy, and hepatic tumors including nodular hyperplasias, adenomas, and carcinomas; a cardiovascular disorder, such as arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease, aneurysms, arterial dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular tumors, and complications of thrombolysis, balloon angioplasty, vascular replacement, and coronary artery bypass graft surgery, congestive heart failure, ischemic heart disease, angina pectoris, myocardial infarction, hypertensive heart disease, degenerative valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart disease, congenital heart disease, and complications of cardiac transplantation; an autoimmune/inflammatory disorder, such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, atherosclerotic plaque rupture, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, degradation of articular cartilage, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; a developmental disorder, such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, bone resorption, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, age-related macular degeneration, and sensorineural hearing loss; an epithelial disorder, such as dyshidrotic eczema, allergic contact dermatitis, keratosis pilaris, melasma, vitiligo, actinic keratosis, basal cell carcinoma, squamous cell carcinoma, seborrheic keratosis, folliculitis, herpes simplex, herpes zoster, varicella, candidiasis, dermatophytosis, scabies, insect bites, cherry angioma, keloid, dermatofibroma, acrochordons, urticaria, transient acantholytic dermatosis, xerosis, eczema, atopic dermatitis, contact dermatitis, hand eczema, nummular eczema, lichen simplex chronicus, asteatotic eczema, stasis dermatitis and stasis ulceration, seborrheic dermatitis, psoriasis, lichen planus, pityriasis rosea, impetigo, ecthyma, dermatophytosis, tinea versicolor, warts, acne vulgaris, acne rosacea, pemphigus vulgaris, pemphigus foliaceus, paraneoplastic pemphigus, bullous pemphigoid, herpes gestationis, dermatitis herpetiformis, linear IgA disease, epidermolysis bullosa acquisita, dermatomyositis, lupus erythematosus, scleroderma and morphea, erythroderma, alopecia, figurate skin lesions, telangiectasias, hypopigmentation, hyperpigmentation, vesicles/bullae, exanthems, cutaneous drug reactions, papulonodular skin lesions, chronic non-healing wounds, photosensitivity diseases, epidermolysis bullosa simplex, epidermolytic hyperkeratosis, epidermolytic and nonepidermolytic palmoplantar keratoderma, ichthyosis bullosa of Siemens, ichthyosis exfoliativa, keratosis palmaris et plantaris, keratosis palmoplantaris, palmoplantar keratoderma, keratosis punctata, Meesmann's corneal dystrophy, pachyonychia congenita, white sponge nevus, steatocystoma multiplex, epidermal nevi/epidermolytic hyperkeratosis type, monilethrix, trichothiodystrophy, chronic hepatitis/cryptogenic cirrhosis, and colorectal hyperplasia; a neurological disorder, such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; and a reproductive disorder, such as infertility, including tubal disease, ovulatory defects, and endometriosis, a disorder of prolactin production, a disruption of the estrous cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation syndrome, an endometrial or ovarian tumor, a uterine fibroid, autoimmune disorders, an ectopic pregnancy, and teratogenesis; cancer of the breast, fibrocystic breast disease, and galactorrhea; a disruption of spermatogenesis, abnormal sperm physiology, cancer of the testis, cancer of the prostate, benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, carcinoma of the male breast, and gynecomastia.

[0193] In another embodiment, a vector capable of expressing PRTS or a fragment or derivative thereof may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of PRTS including, but not limited to, those described above.

[0194] In a further embodiment, a composition comprising a substantially purified PRTS in conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of PRTS including, but not limited to, those provided above.

[0195] In still another embodiment, an agonist which modulates the activity of PRTS may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of PRTS including, but not limited to, those listed above.

[0196] In a further embodiment, an antagonist of PRTS may be administered to a subject to treat or prevent a disorder associated with increased expression or activity of PRTS. Examples of such disorders include, but are not limited to, those gastrointestinal, cardiovascular, autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, and reproductive disorders described above. In one aspect, an antibody which specifically binds PRTS may be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues which express PRTS.

[0197] In an additional embodiment, a vector expressing the complement of the polynucleotide encoding PRTS may be administered to a subject to treat or prevent a disorder associated with increased expression or activity of PRTS including, but not limited to, those described above.

[0198] In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary sequences, or vectors of the invention may be administered in combination with other appropriate therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made by one of ordinary skill in the art, according to conventional pharmaceutical principles. The combination of therapeutic agents may act synergistically to effect the treatment or prevention of the various disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects.

[0199] An antagonist of PRTS may be produced using methods which are generally known in the art. In particular, purified PRTS may be used to produce antibodies or to screen libraries of pharmaceutical agents to identify those which specifically bind PRTS. Antibodies to PRTS may also be generated using methods that are well known in the art. Such antibodies may include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer formation) are generally preferred for therapeutic use.

[0200] For the production of antibodies, various hosts including goats, rabbits, rats, mice, humans, and others may be immunized by injection with PRTS or with any fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species, various adjuvants may be used to increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially preferable.

[0201] It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to PRTS have an amino acid sequence consisting of at least about 5 amino acids, and generally will consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are identical to a portion of the amino acid sequence of the natural protein. Short stretches of PRTS amino acids may be fused with those of another protein, such as KLH, and antibodies to the chimeric molecule may be produced.

[0202] Monoclonal antibodies to PRTS may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol. Methods 81:31-42; Cote, R. J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and Cole, S. P. et al. (1984) Mol. Cell Biol. 62:109-120.)

[0203] In addition, techniques developed for the production of “chimeric antibodies,” such as the splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can be used. (See, e.g., Morrison, S. L. et al. (1984) Proc. Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M. S. et al. (1984) Nature 312:604-608; and Takeda, S. et al. (1985) Nature 314:452-454.) Alternatively, techniques described for the production of single chain antibodies may be adapted, using methods known in the art, to produce PRTS-specific single chain antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be generated by chain shuffling from random combinatorial immunoglobulin libraries. (See, e.g., Burton, D. R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137.)

[0204] Antibodies may also be produced by inducing in vivo production in the lymphocyte population or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299.)

[0205] Antibody fragments which contain specific binding sites for PRTS may also be generated. For example, such fragments include, but are not limited to, F(ab′)₂ fragments produced by pepsin digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab′)₂ fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W. D. et al. (1989) Science 246:1275-1281.)

[0206] Various immunoassays may be used for screening to identify antibodies having the desired specificity. Numerous protocols for competitive binding or immunoradiometric assays using either polyclonal or monoclonal antibodies with established specificities are well known in the art. Such immunoassays typically involve the measurement of complex formation between PRTS and its specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering PRTS epitopes is generally used, but a competitive binding assay may also be employed (Pound, supra).

[0207] Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques may be used to assess the affinity of antibodies for PRTS. Affinity is expressed as an association constant, K_(a), which is defined as the molar concentration of PRTS-antibody complex divided by the product of the molar concentrations of free antigen and free antibody under equilibrium conditions. The K_(a) determined for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for multiple PRTS epitopes, represents the average affinity, or avidity, of the antibodies for PRTS. The K_(a) determined for a preparation of monoclonal antibodies, which are monospecific for a particular PRTS epitope, represents a true measure of affinity. High-affinity antibody preparations with K_(a) ranging from about 10⁹ to 10¹² L/mole are preferred for use in immunoassays in which the PRTS-antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations with K_(a) ranging from about 10⁶ to 10⁷ L/mole are preferred for use in immunopurification and similar procedures which ultimately require dissociation of PRTS, preferably in active form, from the antibody (Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington D.C.; Liddell, J. E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York N.Y.).
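
The definition of K_(a) and the preferred ranges given above can be illustrated with a short Python sketch. It is offered as an assumption-laden example rather than a prescribed procedure: the function names are hypothetical, the Scatchard fit assumes a single class of independent binding sites (so that bound/free plotted against bound is a straight line of slope −K_(a)), and the data in the usage note are synthetic.

```python
def association_constant(complex_molar, free_antigen_molar, free_antibody_molar):
    """K_a in L/mole: [complex] / ([free antigen] * [free antibody])."""
    if free_antigen_molar <= 0 or free_antibody_molar <= 0:
        raise ValueError("free concentrations must be positive")
    return complex_molar / (free_antigen_molar * free_antibody_molar)


def scatchard_ka(bound, free):
    """Estimate K_a from a Scatchard plot by ordinary least squares.

    bound and free are equal-length sequences of molar concentrations.
    Assumes one class of binding sites: bound/free = K_a * (Bmax - bound),
    so the fitted slope equals -K_a.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def suggested_use(ka):
    """Map K_a (L/mole) onto the preferred ranges stated in the text."""
    if 1e9 <= ka <= 1e12:
        return "immunoassay"
    if 1e6 <= ka <= 1e7:
        return "immunopurification"
    return "outside the preferred ranges"
```

As a usage illustration, bound concentrations of 2, 4, and 6 nM with bound/free ratios of 0.8, 0.6, and 0.4 lie on a line of slope −10⁸ L/mole, so `scatchard_ka` returns approximately 10⁸, which falls between the two preferred ranges.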

[0208] The titer and avidity of polyclonal antibody preparations may be further evaluated to determine the quality and suitability of such preparations for certain downstream applications. For example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation of PRTS-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for antibody quality and usage in various applications, are generally available. (See, e.g., Catty, supra, and Coligan et al. supra.)

[0209] In another embodiment of the invention, the polynucleotides encoding PRTS, or any fragment or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene expression can be achieved by designing complementary sequences or antisense molecules (DNA, RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding PRTS. Such technology is well known in the art, and antisense oligonucleotides or larger fragments can be designed from various locations along the coding or control regions of sequences encoding PRTS. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa N.J.)

[0210] In therapeutic use, any gene delivery system suitable for introduction of the antisense sequences into appropriate target cells can be used. Antisense sequences can be delivered intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g., Slater, J. E. et al. (1998) J. Allergy Clin. Immunol. 102(3): 469-475; and Scanlon, K. J. et al. (1995) 9(13): 1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A. D. (1990) Blood 76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3): 323-347.) Other gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in the art. (See, e.g., Rossi, J. J. (1995) Br. Med. Bull. 51(1): 217-225; Boado, R. J. et al. (1998) J. Pharm. Sci. 87(11): 1308-1315; and Morris, M. C. et al. (1997) Nucleic Acids Res. 25(14): 2730-2736.)

[0211] In another embodiment of the invention, polynucleotides encoding PRTS may be used for somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency (Blaese, R. M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R. G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R. G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal, R. G. (1995) Science 270:404-410; Verma, I. M. and N. Somia (1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in PRTS expression or regulation causes disease, the expression of PRTS from an appropriate population of transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.

[0212] In a further embodiment of the invention, diseases or disorders caused by deficiencies in PRTS are treated by constructing mammalian expression vectors encoding PRTS and introducing these vectors by mechanical means into PRTS-deficient cells. Mechanical transfer technologies for use with cells in vivo or ex vivo include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of DNA transposons (Morgan, R. A. and W. F. Anderson (1993) Annu. Rev. Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Récipon (1998) Curr. Opin. Biotechnol. 9:445-450).

[0213] Expression vectors that may be effective for the expression of PRTS include, but are not limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors (Invitrogen, Carlsbad Calif.), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla, Calif.), and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto Calif.). PRTS may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F. M. V. and H. M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F. M. V. and Blau, H. M. supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous gene encoding PRTS from a normal individual.

[0214] Commercially available liposome transformation kits (e.g., the PERFECT LIPID TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver polynucleotides to target cells in culture and require minimal effort to optimize experimental parameters. In the alternative, transformation is performed using the calcium phosphate method (Graham, F. L. and A. J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. (1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these standardized mammalian transfection protocols.

[0215] In another embodiment of the invention, diseases or disorders caused by genetic defects with respect to PRTS expression are treated by constructing a retrovirus vector consisting of (i) the polynucleotide encoding PRTS under the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M. A. et al. (1987) J. Virol. 61:1639-1646; Adam, M. A. and A. D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol. 72:9873-9880). U.S. Pat. No. 5,910,434 to Rigg (“Method for obtaining retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant”) discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4⁺ T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M. L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

[0216] In the alternative, an adenovirus-based gene therapy delivery system is used to deliver polynucleotides encoding PRTS to cells which have one or more genetic abnormalities with respect to the expression of PRTS. The construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M. E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Pat. No. 5,707,618 to Armentano (“Adenovirus vectors for gene therapy”), hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P. A. et al. (1999) Annu. Rev. Nutr. 19:511-544 and Verma, I. M. and N. Somia (1997) Nature 389:239-242, both incorporated by reference herein.

[0217] In another alternative, a herpes-based gene therapy delivery system is used to deliver polynucleotides encoding PRTS to target cells which have one or more genetic abnormalities with respect to the expression of PRTS. The use of herpes simplex virus (HSV)-based vectors may be especially valuable for introducing PRTS to cells of the central nervous system, for which HSV has a tropism. The construction and packaging of herpes-based vectors are well known to those with ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). The construction of an HSV-1 virus vector has also been disclosed in detail in U.S. Pat. No. 5,804,413 to DeLuca (“Herpes simplex virus strains for gene transfer”), which is hereby incorporated by reference. U.S. Pat. No. 5,804,413 teaches the use of recombinant HSV d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under the control of the appropriate promoter for purposes including human gene therapy. Also taught by this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV vectors, see also Goins, W. F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev. Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned herpesvirus sequences, the generation of recombinant virus following the transfection of multiple plasmids containing different segments of the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of ordinary skill in the art.

[0218] In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to deliver polynucleotides encoding PRTS to target cells. The biology of the prototypic alphavirus, Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA, resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase). Similarly, inserting the coding sequence for PRTS into the alphavirus genome in place of the capsid-coding region results in the production of a large number of PRTS-coding RNAs and the synthesis of high levels of PRTS in vector transduced cells. While alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S. A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction of PRTS into a variety of cell types. The specific transduction of a subset of cells in a population may require the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus infections, are well known to those with ordinary skill in the art.

[0219] Oligonucleotides derived from the transcription initiation site, e.g., between about positions −10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have been described in the literature. (See, e.g., Gee, J. E. et al. (1994) in Huber, B. E. and B. J. Carr, Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco N.Y., pp. 163-177.) A complementary sequence or antisense molecule may also be designed to block translation of mRNA by preventing the transcript from binding to ribosomes.

[0220] Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. For example, engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze endonucleolytic cleavage of sequences encoding PRTS.

[0221] Specific ribozyme cleavage sites within any potential RNA target are initially identified by scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides, corresponding to the region of the target gene containing the cleavage site, may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
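The scanning step described above can be illustrated with a minimal Python sketch. The function name `find_cleavage_windows`, the default window length of 17, and the choice to center the window on the triplet are assumptions made for illustration only; the text specifies only the GUA, GUU, and GUC triplets and a candidate region of 15 to 20 ribonucleotides. Secondary-structure evaluation is not attempted here.

```python
import re

def find_cleavage_windows(rna, window=17):
    """List candidate ribozyme target regions in an RNA sequence.

    Returns (site_index, triplet, region) tuples, one per GUA/GUU/GUC
    triplet. `window` (hypothetical parameter) is the region length,
    restricted to the 15-20 nt range given in the text.
    """
    if not 15 <= window <= 20:
        raise ValueError("window must be between 15 and 20 nucleotides")
    # Accept DNA-style input by converting T to U.
    rna = rna.upper().replace("T", "U")
    hits = []
    for match in re.finditer(r"GU[AUC]", rna):
        i = match.start()
        # Roughly center the window on the triplet, clamped to the sequence.
        start = max(0, i + 1 - window // 2)
        end = start + window
        if end > len(rna):
            end = len(rna)
            start = max(0, end - window)
        hits.append((i, match.group(0), rna[start:end]))
    return hits
```

Each returned region would then still need to be checked for secondary structure and hybridization accessibility as described above.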

[0222] Complementary ribonucleic acid molecules and ribozymes of the invention may be prepared by any method known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding PRTS. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

[0223] RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5′ and/or 3′ ends of the molecule, or the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule. This concept is inherent in the production of PNAs and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

[0224] An additional embodiment of the invention encompasses a method for screening for a compound which is effective in altering expression of a polynucleotide encoding PRTS. Compounds which may be effective in altering expression of a specific polynucleotide may include, but are not limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides, transcription factors and other polypeptide transcriptional regulators, and non-macromolecular chemical entities which are capable of interacting with specific polynucleotide sequences. Effective compounds may alter polynucleotide expression by acting as either inhibitors or promoters of polynucleotide expression. Thus, in the treatment of disorders associated with increased PRTS expression or activity, a compound which specifically inhibits expression of the polynucleotide encoding PRTS may be therapeutically useful, and in the treatment of disorders associated with decreased PRTS expression or activity, a compound which specifically promotes expression of the polynucleotide encoding PRTS may be therapeutically useful.

[0225] At least one, and up to a plurality, of test compounds may be screened for effectiveness in altering expression of a specific polynucleotide. A test compound may be obtained by any method commonly known in the art, including chemical modification of a compound known to be effective in altering polynucleotide expression; selection from an existing, commercially-available or proprietary library of naturally-occurring or non-natural chemical compounds; rational design of a compound based on chemical and/or structural properties of the target polynucleotide; and selection from a library of chemical compounds created combinatorially or randomly. A sample comprising a polynucleotide encoding PRTS is exposed to at least one test compound thus obtained. The sample may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted biochemical system. Alterations in the expression of a polynucleotide encoding PRTS are assayed by any method commonly known in the art. Typically, the expression of a specific polynucleotide is detected by hybridization with a probe having a nucleotide sequence complementary to the sequence of the polynucleotide encoding PRTS. The amount of hybridization may be quantified, thus forming the basis for a comparison of the expression of the polynucleotide both with and without exposure to one or more test compounds. Detection of a change in the expression of a polynucleotide exposed to a test compound indicates that the test compound is effective in altering the expression of the polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide can be carried out, for example, using a Schizosaccharomyces pombe gene expression system (Atkins, D. et al. (1999) U.S. Pat. No. 5,932,435; Arndt, G. M. et al. (2000) Nucleic Acids Res. 28:E15) or a human cell line such as HeLa cells (Clarke, M. L. et al. (2000) Biochem. Biophys. Res. Commun. 268:8-13). A particular embodiment of the present invention involves screening a combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice, T. W. et al. (1997) U.S. Pat. No. 5,686,242; Bruice, T. W. et al. (2000) U.S. Pat. No. 6,022,691).

[0226] Many methods for introducing vectors into cells or tissues are available and equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells taken from the patient and clonally propagated for autologous transplant back into that same patient. Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved using methods which are well known in the art. (See, e.g., Goldman, C. K. et al. (1997) Nat. Biotechnol. 15:462-466.)

[0227] Any of the therapeutic methods described above may be applied to any subject in need of such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and monkeys.

[0228] An additional embodiment of the invention relates to the administration of a composition which generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various formulations are commonly known and are thoroughly discussed in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing, Easton Pa.). Such compositions may consist of PRTS, antibodies to PRTS, and mimetics, agonists, antagonists, or inhibitors of PRTS.

[0229] The compositions utilized in this invention may be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

[0230] Compositions for pulmonary administration may be prepared in liquid or dry powder form. These compositions are generally aerosolized immediately prior to inhalation by the patient. In the case of small molecules (e.g., traditional low molecular weight organic drugs), aerosol delivery of fast-acting formulations is well-known in the art. In the case of macromolecules (e.g., larger peptides and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J. S. et al., U.S. Pat. No. 5,997,848). Pulmonary delivery has the advantage of administration without needle injection, and obviates the need for potentially toxic penetration enhancers.

[0231] Compositions suitable for use in the invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. The determination of an effective dose is well within the capability of those skilled in the art.

[0232] Specialized forms of compositions may be prepared for direct intracellular delivery of macromolecules comprising PRTS or fragments thereof. For example, liposome preparations containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the macromolecule. Alternatively, PRTS or a fragment thereof may be joined to a short cationic N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S. R. et al. (1999) Science 285:1569-1572).

[0233] For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys, or pigs. An animal model may also be used to determine the appropriate concentration range and route of administration. Such information can then be used to determine useful doses and routes for administration in humans.

[0234] A therapeutically effective dose refers to that amount of active ingredient, for example PRTS or fragments thereof, antibodies to PRTS, and agonists, antagonists or inhibitors of PRTS, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by calculating the ED₅₀ (the dose therapeutically effective in 50% of the population) or LD₅₀ (the dose lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the therapeutic index, which can be expressed as the LD₅₀/ED₅₀ ratio. Compositions which exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are used to formulate a range of dosage for human use. The dosage contained in such compositions is preferably within a range of circulating concentrations that includes the ED₅₀ with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, the sensitivity of the patient, and the route of administration.
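As a minimal illustration of the LD₅₀/ED₅₀ ratio defined above, the following Python sketch computes a therapeutic index from two dose values. The function name and the example doses are hypothetical and are not taken from any study; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, LD50 / ED50.

    Both doses must be positive and in the same units (for example,
    mg per kg body weight); a larger index is preferred.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50

# Hypothetical values for illustration only.
index = therapeutic_index(ld50=500.0, ed50=25.0)
```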

[0235] The exact dosage will be determined by the practitioner, in light of factors related to the subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the active moiety or to maintain the desired effect. Factors which may be taken into account include the severity of the disease state, the general health of the subject, the age, weight, and gender of the subject, time and frequency of administration, drug combination(s), reaction sensitivities, and response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week, or biweekly depending on the half-life and clearance rate of the particular formulation.

[0236] Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of about 1 gram, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, locations, etc.

Diagnostics

[0237] In another embodiment, antibodies which specifically bind PRTS may be used for the diagnosis of disorders characterized by expression of PRTS, or in assays to monitor patients being treated with PRTS or agonists, antagonists, or inhibitors of PRTS. Antibodies useful for diagnostic purposes may be prepared in the same manner as described above for therapeutics. Diagnostic assays for PRTS include methods which utilize the antibody and a label to detect PRTS in human body fluids or in extracts of cells or tissues. The antibodies may be used with or without modification, and may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of reporter molecules, several of which are described above, are known in the art and may be used.

[0238] A variety of protocols for measuring PRTS, including ELISAs, RIAs, and FACS, are known in the art and provide a basis for diagnosing altered or abnormal levels of PRTS expression. Normal or standard values for PRTS expression are established by combining body fluids or cell extracts taken from normal mammalian subjects, for example, human subjects, with antibodies to PRTS under conditions suitable for complex formation. The amount of standard complex formation may be quantitated by various methods, such as photometric means. Quantities of PRTS expressed in subject, control, and disease samples from biopsied tissues are compared with the standard values. Deviation between standard and subject values establishes the parameters for diagnosing disease.

[0239] In another embodiment of the invention, the polynucleotides encoding PRTS may be used for diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences, complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and quantify gene expression in biopsied tissues in which expression of PRTS may be correlated with disease. The diagnostic assay may be used to determine absence, presence, and excess expression of PRTS, and to monitor regulation of PRTS levels during therapeutic intervention.

[0240] In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide sequences, including genomic sequences, encoding PRTS or closely related molecules may be used to identify nucleic acid sequences which encode PRTS. The specificity of the probe, whether it is made from a highly specific region, e.g., the 5′ regulatory region, or from a less specific region, e.g., a conserved motif, and the stringency of the hybridization or amplification will determine whether the probe identifies only naturally occurring sequences encoding PRTS, allelic variants, or related sequences.

[0241] Probes may also be used for the detection of related sequences, and may have at least 50% sequence identity to any of the PRTS encoding sequences. The hybridization probes of the subject invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO: 18-34 or from genomic sequences including promoters, enhancers, and introns of the PRTS gene.
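The 50% sequence identity threshold mentioned above can be checked with a short Python sketch. This is only one possible convention, stated here as an assumption: the two sequences are taken to be already aligned and of equal length, gaps are written as "-", and identity is the number of matching non-gap columns divided by the number of columns in which at least one sequence has a residue. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned, equal-length sequences.

    Gap characters are '-'. Columns where both sequences have a gap are
    ignored; a column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns

def meets_probe_threshold(aligned_a, aligned_b, threshold=50.0):
    """True if the pair reaches the stated minimum percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other definitions of identity (for example, dividing by the length of the shorter sequence) give different values, so the convention should be stated alongside any reported figure.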

[0242] Means for producing specific hybridization probes for DNAs encoding PRTS include the cloning of polynucleotide sequences encoding PRTS or PRTS derivatives into vectors for the production of mRNA probes. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter groups, for example, by radionuclides such as ³²P or ³⁵S, or by enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

[0243] Polynucleotide sequences encoding PRTS may be used for the diagnosis of disorders associated with expression of PRTS. Examples of such disorders include, but are not limited to, a gastrointestinal disorder, such as dysphagia, peptic esophagitis, esophageal spasm, esophageal stricture, esophageal carcinoma, dyspepsia, indigestion, gastritis, gastric carcinoma, anorexia, nausea, emesis, gastroparesis, antral or pyloric edema, abdominal angina, pyrosis, gastroenteritis, intestinal obstruction, infections of the intestinal tract, peptic ulcer, cholelithiasis, cholecystitis, cholestasis, pancreatitis, pancreatic carcinoma, biliary tract disease, hepatitis, hyperbilirubinemia, cirrhosis, passive congestion of the liver, hepatoma, infectious colitis, ulcerative colitis, ulcerative proctitis, Crohn's disease, Whipple's disease, Mallory-Weiss syndrome, colonic carcinoma, colonic obstruction, irritable bowel syndrome, short bowel syndrome, diarrhea, constipation, gastrointestinal hemorrhage, acquired immunodeficiency syndrome (AIDS) enteropathy, jaundice, hepatic encephalopathy, hepatorenal syndrome, hepatic steatosis, hemochromatosis, Wilson's disease, alpha₁-antitrypsin deficiency, Reye's syndrome, primary sclerosing cholangitis, liver infarction, portal vein obstruction and thrombosis, centrilobular necrosis, peliosis hepatis, hepatic vein thrombosis, veno-occlusive disease, preeclampsia, eclampsia, acute fatty liver of pregnancy, intrahepatic cholestasis of pregnancy, and hepatic tumors including nodular hyperplasias, adenomas, and carcinomas; a cardiovascular disorder, such as arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease, aneurysms, arterial dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular tumors, and complications of thrombolysis, balloon angioplasty, vascular replacement, and coronary artery bypass graft surgery, congestive heart failure, ischemic heart disease, angina pectoris, myocardial infarction, hypertensive heart disease, degenerative valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart disease, congenital heart disease, and complications of cardiac transplantation; an autoimmune/inflammatory disorder, such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, atherosclerotic plaque rupture, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, degradation of articular cartilage, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; a developmental disorder, such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, bone resorption, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, age-related macular degeneration, and sensorineural hearing loss; an epithelial disorder, such as dyshidrotic eczema, allergic contact dermatitis, keratosis pilaris, melasma, vitiligo, actinic keratosis, basal cell carcinoma, squamous cell carcinoma, seborrheic keratosis, folliculitis, herpes simplex, herpes zoster, varicella, candidiasis, dermatophytosis, scabies, insect bites, cherry angioma, keloid, dermatofibroma, acrochordons, urticaria, transient acantholytic dermatosis, xerosis, eczema, atopic dermatitis, contact dermatitis, hand eczema, nummular eczema, lichen simplex chronicus, asteatotic eczema, stasis dermatitis and stasis ulceration, seborrheic dermatitis, psoriasis, lichen planus, pityriasis rosea, impetigo, ecthyma, dermatophytosis, tinea versicolor, warts, acne vulgaris, acne rosacea, pemphigus vulgaris, pemphigus foliaceus, paraneoplastic pemphigus, bullous pemphigoid, herpes gestationis, dermatitis herpetiformis, linear IgA disease, epidermolysis bullosa acquisita, dermatomyositis, lupus erythematosus, scleroderma and morphea, erythroderma, alopecia, figurate skin lesions, telangiectasias, hypopigmentation, hyperpigmentation, vesicles/bullae, exanthems, cutaneous drug reactions, papulonodular skin lesions, chronic non-healing wounds, photosensitivity diseases, epidermolysis bullosa simplex, epidermolytic hyperkeratosis, epidermolytic and nonepidermolytic palmoplantar keratoderma, ichthyosis bullosa of Siemens, ichthyosis exfoliativa, keratosis palmaris et plantaris, keratosis palmoplantaris, palmoplantar keratoderma, keratosis punctata, Meesmann's corneal dystrophy, pachyonychia congenita, white sponge nevus, steatocystoma multiplex, epidermal nevi/epidermolytic hyperkeratosis type, monilethrix, trichothiodystrophy, chronic hepatitis/cryptogenic cirrhosis, and colorectal hyperplasia; a neurological disorder, such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathesia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; and a reproductive disorder, such as infertility, including tubal disease, ovulatory defects, and endometriosis, a disorder of prolactin production, a disruption of the estrous cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation syndrome, an endometrial or ovarian tumor, a uterine fibroid, autoimmune disorders, an ectopic pregnancy, and teratogenesis; cancer of the breast, fibrocystic breast disease, and galactorrhea; a disruption of spermatogenesis, abnormal sperm physiology, cancer of the testis, cancer of the prostate, benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, carcinoma of the male breast, and gynecomastia. The polynucleotide sequences encoding PRTS may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like assays; and in microarrays utilizing fluids or tissues from patients to detect altered PRTS expression. Such qualitative or quantitative methods are well known in the art.

[0244] In a particular aspect, the nucleotide sequences encoding PRTS may be useful in assays that detect the presence of associated disorders, particularly those mentioned above. The nucleotide sequences encoding PRTS may be labeled by standard methods and added to a fluid or tissue sample from a patient under conditions suitable for the formation of hybridization complexes. After a suitable incubation period, the sample is washed and the signal is quantified and compared with a standard value. If the amount of signal in the patient sample is significantly altered in comparison to a control sample, then the presence of altered levels of nucleotide sequences encoding PRTS in the sample indicates the presence of the associated disorder. Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment of an individual patient.

[0245] In order to provide a basis for the diagnosis of a disorder associated with expression of PRTS, a normal or standard profile for expression is established. This may be accomplished by combining body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a fragment thereof, encoding PRTS, under conditions suitable for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with values from an experiment in which a known amount of a substantially purified polynucleotide is used. Standard values obtained in this manner may be compared with values obtained from samples from patients who are symptomatic for a disorder. Deviation from standard values is used to establish the presence of a disorder.

[0246] Once the presence of a disorder is established and a treatment protocol is initiated, hybridization assays may be repeated on a regular basis to determine if the level of expression in the patient begins to approximate that which is observed in the normal subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

[0247] With respect to cancer, the presence of an abnormal amount of transcript (either under- or overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development of the disease, or may provide a means for detecting the disease prior to the appearance of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ preventative measures or aggressive treatment earlier, thereby preventing the development or further progression of the cancer.

[0248] Additional diagnostic uses for oligonucleotides designed from the sequences encoding PRTS may involve the use of PCR. These oligomers may be chemically synthesized, generated enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide encoding PRTS, or a fragment of a polynucleotide complementary to the polynucleotide encoding PRTS, and will be employed under optimized conditions for identification of a specific gene or condition. Oligomers may also be employed under less stringent conditions for detection or quantification of closely related DNA or RNA sequences.

[0249] In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences encoding PRTS may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from the polynucleotide sequences encoding PRTS are used to amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by comparing the sequence of individual overlapping DNA fragments which assemble into a common consensus sequence. These computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing errors using statistical models and automated analyses of DNA sequence chromatograms. In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego Calif.).
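The in silico comparison of overlapping fragments described above may be sketched in Python as follows. This is an illustrative simplification and not the isSNP method itself: the fragments are assumed to be already aligned to a common coordinate system and padded with "-" where a fragment does not cover a position, and a fixed minimum count for the minor allele (the hypothetical parameter `min_minor_count`) stands in for the statistical error models and chromatogram analyses mentioned in the text.

```python
from collections import Counter

def candidate_snps(aligned_fragments, min_minor_count=2):
    """Report positions where aligned fragments disagree.

    Returns (position, consensus_base, minor_base) tuples for columns in
    which the second most frequent base is seen at least
    `min_minor_count` times. Single discordant reads are treated as
    probable sequencing errors and skipped.
    """
    if not aligned_fragments:
        return []
    length = len(aligned_fragments[0])
    if any(len(f) != length for f in aligned_fragments):
        raise ValueError("fragments must be padded to the same length")
    snps = []
    for pos in range(length):
        # Count bases in this column, ignoring uncovered positions.
        column = Counter(f[pos] for f in aligned_fragments if f[pos] != "-")
        if len(column) < 2:
            continue
        ranked = column.most_common()
        consensus_base = ranked[0][0]
        minor_base, minor_count = ranked[1]
        if minor_count >= min_minor_count:
            snps.append((pos, consensus_base, minor_base))
    return snps
```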

[0250] Methods which may also be used to quantify the expression of PRTS include radiolabeling or biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from standard curves. (See, e.g., Melby, P. C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid quantitation.
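Interpolation from a standard curve can be sketched as follows. This is a minimal illustration assuming piecewise-linear interpolation between bracketing standards; the function name and the input format are hypothetical, and the cited methods may fit the curve differently.

```python
def interpolate_from_standard_curve(standards, signal):
    """Estimate an unknown amount from its measured signal.

    standards: list of (known_amount, measured_signal) pairs.
    signal: measured response of the unknown sample.
    Uses linear interpolation between the two standards whose signals
    bracket the sample (an assumption; other curve fits are possible).
    """
    points = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal lies outside the range of the standard curve")
```

Signals outside the range of the standards raise an error rather than being extrapolated.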

[0251] In further embodiments, oligonucleotides or longer fragments derived from any of the polynucleotide sequences described herein may be used as elements on a microarray. The microarray can be used in transcript imaging techniques which monitor the relative expression levels of large numbers of genes simultaneously as described below. The microarray may also be used to identify genetic variants, mutations, and polymorphisms. This information may be used to determine gene function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor progression/regression of disease as a function of gene expression, and to develop and monitor the activities of therapeutic agents in the treatment of disease. In particular, this information may be used to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective treatment regimen for that patient. For example, therapeutic agents which are highly effective and display the fewest side effects may be selected for a patient based on his/her pharmacogenomic profile.

[0252] In another embodiment, PRTS, fragments of PRTS, or antibodies specific for PRTS may be used as elements on a microarray. The microarray may be used to monitor or measure protein-protein interactions, drug-target interactions, and gene expression profiles, as described above.

[0253] A particular embodiment relates to the use of the polynucleotides of the present invention to generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and their relative abundance under given conditions and at a given time. (See Seilhamer et al., “Comparative Gene Transcript Analysis,” U.S. Pat. No. 5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by hybridizing the polynucleotides of the present invention or their complements to the totality of transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the hybridization takes place in high-throughput format, wherein the polynucleotides of the present invention or their complements comprise a subset of a plurality of elements on a microarray. The resultant transcript image would provide a profile of gene activity.

[0254] Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

[0255] Transcript images which profile the expression of the polynucleotides of the present invention may also be used in conjunction with in vitro model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental compounds. All compounds induce characteristic gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N. L. Anderson (2000) Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test compound has a signature similar to that of a compound with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful and refined when they contain expression information from a large number of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest quality signature. Genes whose expression is not altered by any tested compounds are important as well, as the levels of expression of these genes are used to normalize the rest of the expression data. The normalization procedure is useful for comparison of expression data after treatment with different compounds. While the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released Feb. 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in toxicological screening using toxicant signatures to include all expressed gene sequences.

[0256] In one embodiment, the toxicity of a test compound is assessed by treating a biological sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated biological sample are hybridized with one or more probes specific to the polynucleotides of the present invention, so that transcript levels corresponding to the polynucleotides of the present invention may be quantified. The transcript levels in the treated biological sample are compared with levels in an untreated biological sample. Differences in the transcript levels between the two samples are indicative of a toxic response caused by the test compound in the treated sample.
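The comparison of treated and untreated transcript levels, normalized against genes whose expression is unaffected, might be sketched as below. This is an illustrative assumption rather than a prescribed procedure: the function name, the use of a mean of reference-gene signals as the scaling factor, and the log2 ratio as the reported difference are all choices made for the example, and all signals are assumed to be positive.

```python
import math


def log2_fold_changes(treated, untreated, reference_genes):
    """Compare normalized transcript levels between two samples.

    treated, untreated: dicts mapping gene name -> positive signal.
    reference_genes: genes assumed to be unaffected by the test compound;
    their mean signal is used to scale each sample (an assumption).
    Returns gene -> log2(treated / untreated) for non-reference genes.
    """
    def scale(sample):
        return sum(sample[g] for g in reference_genes) / len(reference_genes)

    t_scale, u_scale = scale(treated), scale(untreated)
    return {
        gene: math.log2((treated[gene] / t_scale) / (untreated[gene] / u_scale))
        for gene in treated
        if gene in untreated and gene not in reference_genes
    }
```

A value near zero indicates no change after normalization; large positive or negative values flag transcripts whose levels differ between the two samples.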

[0257] Another particular embodiment relates to the use of the polypeptide sequences of the present invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles, are analyzed by quantifying the number of expressed proteins and their relative abundance under given conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using two dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is generally proportional to the level of the protein in the sample. The optical densities of equivalently positioned protein spots from different samples, for example, from biological samples either treated or untreated with a test compound or therapeutic agent, are compared to identify any changes in protein spot density related to the treatment. The proteins in the spots are partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry. The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. In some cases, further sequence data may be obtained for definitive protein identification.
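The comparison of a partial sequence with a set of polypeptide sequences can be illustrated with a short Python sketch. It assumes exact contiguous matching; the function name, the dictionary input, and the example sequences in the usage note are hypothetical and are not sequences of the invention.

```python
def matching_polypeptides(partial_sequence, polypeptides, min_len=5):
    """Return IDs of polypeptides that contain the partial sequence.

    partial_sequence: amino acid string obtained from a gel spot.
    polypeptides: dict mapping an identifier -> full amino acid sequence.
    min_len: minimum number of contiguous residues required (5 here,
    following the preference stated in the text).
    Matching is exact and contiguous, which is a simplifying assumption.
    """
    if len(partial_sequence) < min_len:
        raise ValueError("partial sequence is shorter than min_len residues")
    return [pid for pid, seq in polypeptides.items() if partial_sequence in seq]
```

For instance, with the made-up entries `{"example_1": "MKTLLVAGHCWE", "example_2": "MSSPQRDTLAK"}`, the query `"LVAGH"` matches only `example_1`. A short match can occur in more than one protein, which is why further sequence data may be needed for definitive identification.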

[0258] A proteomic profile may also be generated using antibodies specific for PRTS to quantify the levels of PRTS expression. In one embodiment, the antibodies are used as elements on a microarray, and protein expression levels are quantified by exposing the microarray to the sample and detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111; Mendoza, L. G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.

[0259] Toxicant signatures at the proteome level are also useful for toxicological screening, and should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the analysis of compounds which do not significantly affect the transcript image, but which alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

[0260] In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins that are expressed in the treated biological sample are separated so that the amount of each protein can be quantified. The amount of each protein is compared to the amount of the corresponding protein in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the individual proteins and comparing these partial sequences to the polypeptides of the present invention.

[0261] In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins from the biological sample are incubated with antibodies specific to the polypeptides of the present invention. The amount of protein recognized by the antibodies is quantified. The amount of protein in the treated biological sample is compared with the amount in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.

[0262] Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g., Brennan, T. M. et al. (1995) U.S. Pat. No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/25116; Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R. A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; and Heller, M. J. et al. (1997) U.S. Pat. No. 5,605,662.) Various types of microarrays are well known and thoroughly described in DNA Microarrays: A Practical Approach, M. Schena, ed. (1999) Oxford University Press, London, hereby expressly incorporated by reference.

[0263] In another embodiment of the invention, nucleic acid sequences encoding PRTS may be used to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either coding or noncoding sequences may be used, and in some instances, noncoding sequences may be preferable over coding sequences. For example, conservation of a coding sequence among members of a multi-gene family may potentially cause undesired cross hybridization during chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355; Price, C. M. (1993) Blood Rev. 7:127-134; and Trask, B. J. (1991) Trends Genet. 7:149-154.) Once mapped, the nucleic acid sequences of the invention may be used to develop genetic linkage maps, for example, which correlate the inheritance of a disease state with the inheritance of a particular chromosome region or restriction fragment length polymorphism (RFLP). (See, for example, Lander, E. S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)

[0264] Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic map data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic map data can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) World Wide Web site. Correlation between the location of the gene encoding PRTS on a physical map and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA associated with that disorder and thus may further positional cloning efforts.

[0265] In situ hybridization of chromosomal preparations and physical mapping techniques, such as linkage analysis using established chromosomal markers, may be used for extending genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the exact chromosomal locus is not known. This information is valuable to investigators searching for disease genes using positional cloning or other gene discovery techniques. Once the gene or genes responsible for a disease or syndrome have been crudely localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences mapping to that area may represent associated or regulatory genes for further investigation. (See, e.g., Gatti, R. A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of the instant invention may also be used to detect differences in the chromosomal location due to translocation, inversion, etc., among normal, carrier, or affected individuals.

[0266] In another embodiment of the invention, PRTS, its catalytic or immunogenic fragments, or oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug screening techniques. The fragment employed in such screening may be free in solution, affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes between PRTS and the agent being tested may be measured.

[0267] Another technique for drug screening provides for high throughput screening of compounds having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT application WO84/03564.) In this method, large numbers of different small test compounds are synthesized on a solid substrate. The test compounds are reacted with PRTS, or fragments thereof, and washed. Bound PRTS is then detected by methods well known in the art. Purified PRTS can also be coated directly onto plates for use in the aforementioned drug screening techniques. Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

[0268] In another embodiment, one may use competitive drug screening assays in which neutralizing antibodies capable of binding PRTS specifically compete with a test compound for binding PRTS. In this manner, antibodies can be used to detect the presence of any peptide which shares one or more antigenic determinants with PRTS.

[0269] In additional embodiments, the nucleotide sequences which encode PRTS may be used in any molecular biology techniques that have yet to be developed, provided the new techniques rely on properties of nucleotide sequences that are currently known, including, but not limited to, such properties as the triplet genetic code and specific base pair interactions.

[0270] Without further elaboration, it is believed that one skilled in the art can, using the preceding description, utilize the present invention to its fullest extent. The following embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever.

[0271] The disclosures of all patents, applications, and publications mentioned above and below, including U.S. Ser. No. 60/231,039, U.S. Ser. No. 60/232,812, U.S. Ser. No. 60/234,850, U.S. Ser. No. 60/236,500, U.S. Ser. No. 60/238,773, and U.S. Ser. No. 60/239,658, are hereby expressly incorporated by reference.

EXAMPLES

I. Construction of cDNA Libraries

[0272] Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto Calif.) and shown in Table 4, column 5. Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and ethanol, or by other routine methods.

[0273] Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN, Chatsworth Calif.), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Austin Tex.).

[0274] In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, units 5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad Calif.), PBK-CMV plasmid (Stratagene), PCR2-TOPOTA (Invitrogen), PCMV-ICIS (Stratagene), or pINCY (Incyte Genomics, Palo Alto Calif.), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.

II. Isolation of cDNA Clones

[0275] Plasmids obtained as described in Example I were recovered from host cells by in vivo excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg Md.); and QIAWELL 8 Plasmid, QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4° C.

[0276] Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a high-throughput format (Rao, V. B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using PICOGREEN dye (Molecular Probes, Eugene Oreg.) and a FLUOROSKAN II fluorescence scanner (Labsystems Oy, Helsinki, Finland).

III. Sequencing and Analysis

[0277] Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows. Sequencing reactions were processed using standard methods or high-throughput instrumentation such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base calling software; or other sequence analysis systems known in the art. Reading frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel, 1997, supra, unit 7.7). Some of the cDNA sequences were selected for extension using the techniques disclosed in Example VIII.

[0278] The polynucleotide sequences derived from Incyte cDNAs were validated by removing vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte cDNA sequences or translations thereof were then queried against a selection of public databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS, PRINTS, DOMO, PRODOM, and hidden Markov model (HMM)-based protein family databases such as PFAM. (HMM is a probabilistic approach which analyzes consensus primary structures of gene families. See, for example, Eddy, S. R. (1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries were performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA sequences were assembled to produce full length polynucleotide sequences. Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or Genscan-predicted coding sequences (see Examples IV and V) were used to extend Incyte cDNA assemblages to full length. Assembly was performed using programs based on Phred, Phrap, and Consed, and cDNA assemblages were screened for open reading frames using programs based on GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated to derive the corresponding full length polypeptide sequences. Alternatively, a polypeptide of the invention may begin at any of the methionine residues of the full length translated polypeptide. Full length polypeptide sequences were subsequently analyzed by querying against databases such as the GenBank protein databases (genpept), SwissProt, BLOCKS, PRINTS, DOMO, PRODOM, Prosite, and hidden Markov model (HMM)-based protein family databases such as PFAM. Full length polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco Calif.) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are generated using default parameters specified by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent identity between aligned sequences.
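Percent identity between two aligned sequences can be illustrated with the following Python sketch. The text does not state which convention MEGALIGN applies, so this example assumes one common definition (identical residues divided by the number of aligned, ungapped columns); the function name and gap character are likewise assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    aligned_a, aligned_b: two rows of a pairwise alignment, equal length.
    Columns containing a gap in either row are skipped; this is one of
    several conventions and is assumed here for illustration only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

Other definitions divide by the full alignment length or by the length of the shorter sequence, so values from different programs are not always directly comparable.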

[0279] Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of Incyte cDNA and full length sequences and provides applicable descriptions, references, and threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the second column provides brief descriptions thereof, the third column presents appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth column presents, where applicable, the scores, probability values, and other parameters used to evaluate the strength of a match between two sequences (the higher the score or the lower the probability value, the greater the identity between two sequences).

[0280] The programs described above for the assembly and analysis of full length polynucleotide and polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID NO: 18-34. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and amplification technologies are described in Table 4, column 4.

IV. Identification and Editing of Coding Sequences from Genomic DNA

[0281] Putative proteases were initially identified by running the Genscan gene identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is a general-purpose gene identification program which analyzes genomic DNA sequences from a variety of organisms. (See Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94, and Burge, C. and S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354.) The program concatenates predicted exons to form an assembled cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of sequence for Genscan to analyze at once was set to 30 kb. To determine which of these Genscan predicted cDNA sequences encode proteases, the encoded polypeptides were analyzed by querying against PFAM models for proteases. Potential proteases were also identified by homology to Incyte cDNA sequences that had been annotated as proteases. These selected Genscan-predicted sequences were then compared by BLAST analysis to the genpept and gbpri public databases. Where necessary, the Genscan-predicted sequences were then edited by comparison to the top BLAST hit from genpept to correct errors in the sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis was also used to find any Incyte cDNA or public cDNA coverage of the Genscan-predicted sequences, thus providing evidence for transcription. When Incyte cDNA coverage was available, this information was used to correct or confirm the Genscan predicted sequence. Full length polynucleotide sequences were obtained by assembling Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA sequences using the assembly process described in Example III. Alternatively, full length polynucleotide sequences were derived entirely from edited or unedited Genscan-predicted coding sequences.
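The concatenation of predicted exons into an assembled cDNA that runs from a methionine codon to a stop codon can be sketched in Python as follows. This is only an illustration of the idea and not Genscan itself: the function name, the forward-strand 0-based half-open coordinate convention, and the simple completeness check are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_predicted_cdna(genomic, exons):
    """Join exon intervals and check for a start-to-stop coding sequence.

    genomic: forward-strand genomic DNA string (uppercase).
    exons: list of (start, end) intervals, 0-based and half-open
    (a hypothetical coordinate convention chosen for this sketch).
    Returns (cdna, is_complete). is_complete only checks the length,
    the initial ATG, and the final stop codon; it does not look for
    in-frame stop codons inside the sequence.
    """
    cdna = "".join(genomic[start:end] for start, end in sorted(exons))
    is_complete = (
        len(cdna) % 3 == 0
        and cdna.startswith("ATG")
        and cdna[-3:] in STOP_CODONS
    )
    return cdna, is_complete
```

A prediction that fails the check would be a candidate for the editing step described above, for example where an exon was omitted or added in error.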

V. Assembly of Genomic Sequence Data with cDNA Sequence Data

[0282] “Stitched” Sequences

[0283] Partial cDNA sequences were extended with exons predicted by the Genscan gene identification program described in Example IV. Partial cDNAs assembled as described in Example III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm based on graph theory and dynamic programming to integrate cDNA and genomic information, generating possible splice variants that were subsequently confirmed, edited, or extended to create a full length sequence. Sequence intervals in which the entire length of the interval was present on more than one sequence in the cluster were identified, and intervals thus identified were considered to be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic sequences, then all three intervals were considered to be equivalent. This process allows unrelated but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals thus identified were then “stitched” together by the stitching algorithm in the order that they appear along their parent sequences to generate the longest possible sequence, as well as sequence variants. Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or genomic sequence to genomic sequence) were given preference over linkages which change parent type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
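Grouping intervals that are "equivalent by transitivity" can be illustrated with a union-find (disjoint-set) structure, sketched below in Python. This is a swapped-in, generic technique chosen for illustration; the text does not say how the stitching algorithm implements this step, and the function name and interval labels are hypothetical.

```python
def group_equivalent_intervals(pairs):
    """Group interval labels into equivalence classes.

    pairs: iterable of (interval_a, interval_b) that were found to be
    equivalent. Equivalence is extended transitively, so if A~B and
    A~C then A, B, and C end up in the same group.
    Returns a list of sets of interval labels.
    """
    parent = {}

    def find(x):
        # Locate the representative of x, compressing the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for label in list(parent):
        groups.setdefault(find(label), set()).add(label)
    return list(groups.values())
```

With the pairs `("cDNA1", "gen1")` and `("cDNA1", "gen2")`, all three labels fall into one group, mirroring the example of an interval present on a cDNA and two genomic sequences.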

[0284] “Stretched” Sequences

[0285] Partial DNA sequences were extended to full length with an algorithm based on BLAST analysis. First, partial cDNAs assembled as described in Example III were queried against public databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST analysis to either Incyte cDNA sequences or Genscan exon predicted sequences described in Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs (HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions may occur in the chimeric protein with respect to the original GenBank protein homolog. The GenBank protein homolog, the chimeric protein, or both were used as probes to search for homologous genomic sequences from the public human genome databases. Partial DNA sequences were therefore “stretched” or extended by the addition of homologous genomic sequences. The resultant stretched sequences were examined to determine whether they contained a complete gene.

VI. Chromosomal Mapping of PRTS Encoding Polynucleotides

[0286] The sequences which were used to assemble SEQ ID NO: 18-34 were compared with sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other implementations of the Smith-Waterman algorithm. Sequences from these databases that matched SEQ ID NO: 18-34 were assembled into clusters of contiguous and overlapping sequences using assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.
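The propagation of a known map location to every member of a cluster can be sketched as follows. The data structures, names, and the decision to skip clusters whose members carry conflicting locations are assumptions made for this illustration; the text does not describe how conflicts, if any, were handled.

```python
def assign_map_locations(clusters, known_locations):
    """Propagate known map locations to all sequences in a cluster.

    clusters: dict mapping cluster id -> list of sequence identifiers.
    known_locations: dict mapping sequence identifier -> map location
    (any hashable value, e.g. a (chromosome, start_cM, end_cM) tuple).
    A cluster is assigned a location only when its mapped members agree;
    skipping conflicting clusters is a choice made for this sketch.
    """
    assigned = {}
    for members in clusters.values():
        locations = {known_locations[m] for m in members if m in known_locations}
        if len(locations) == 1:
            location = locations.pop()
            for member in members:
                assigned[member] = location
    return assigned
```

Sequences in clusters that contain no previously mapped member simply remain unassigned.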

[0287] Map locations are represented by ranges, or intervals, of human chromosomes. The map position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid markers whose sequences were included in each of the clusters. Human genome maps and other resources available to the public, such as the NCBI “GeneMap'99” World Wide Web site (http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease genes map within or in proximity to the intervals indicated above.

[0288] In this manner, SEQ ID NO: 18 was mapped to chromosome 16 within the interval from 33.4 to 42.7 centiMorgans. Similarly, SEQ ID NO: 22 was mapped to chromosome 1 within the interval from 219.2 to 222.7 centiMorgans.

VII. Analysis of Polynucleotide Expression

[0289] Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel (1995) supra, ch. 4 and 16.)

[0290] Analogous computer techniques applying BLAST were used to search for identical or related molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or similar. The basis of the search is the product score, which is defined as:

$$\frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length}(\text{Seq. 1}),\ \text{length}(\text{Seq. 2})\}}$$

[0291] The product score takes into account both the degree of similarity between two sequences and the length of the sequence match. The product score is a normalized value between 0 and 100, and is calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and −4 for every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The product score represents a balance between fractional overlap and quality in a BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the entire length of the shorter of the two sequences being compared. A product score of 70 is produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
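
The product score calculation can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes a single ungapped HSP, so that the aligned length equals matches plus mismatches and percent identity is taken over that aligned length.

```python
def blast_score(matches, mismatches):
    """Score an HSP with +5 per matching base and -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches, mismatches, len_seq1, len_seq2):
    """Product score: BLAST score times percent identity, divided by
    five times the length of the shorter sequence.

    Assumes one ungapped HSP (aligned length = matches + mismatches).
    """
    aligned = matches + mismatches
    if aligned == 0:
        raise ValueError("HSP must contain at least one aligned base")
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)
```

Under these assumptions, 100 matching bases covering all of a 100-base shorter sequence give a score of 100, and 70 matching bases covering 70% of it give a score of 70, in line with the examples in the text.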

[0292] Alternatively, polynucleotide sequences encoding PRTS are analyzed with respect to the tissue sources from which they were derived. For example, some full length sequences are assembled, at least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the following organ/tissue categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number of libraries in each category is counted and divided by the total number of libraries across all categories. Similarly, each human tissue is classified into one of the following disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided by the total number of libraries across all categories. The resulting percentages reflect the tissue- and disease-specific expression of cDNA encoding PRTS. cDNA sequences and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto Calif.).
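
The per-category percentages can be sketched as follows. The function name and the input format (one category label per library in which the sequence was found) are assumptions made for illustration; the LIFESEQ GOLD data model is not described in the text.

```python
from collections import Counter


def category_percentages(library_categories):
    """Given one category label per library, return the percentage of
    libraries falling into each category."""
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}
```

The same function applies to either classification, since both the organ/tissue and the disease/condition percentages are counts divided by the total number of libraries.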

VIII. Extension of PRTS Encoding Polynucleotides

[0293] Full length polynucleotide sequences were also produced by extension of an appropriate fragment of the full length molecule using oligonucleotide primers designed from this fragment. One primer was synthesized to initiate 5′ extension of the known fragment, and the other primer was synthesized to initiate 3′ extension of the known fragment. The initial primers were designed using OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence at temperatures of about 68° C. to about 72° C. Any stretch of nucleotides which would result in hairpin structures and primer-primer dimerizations was avoided.
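
Two of the stated primer criteria, length and GC content, are simple enough to check directly. The sketch below is illustrative only and is not the OLIGO 4.06 method; the function names are hypothetical, and annealing temperature, hairpin formation, and primer-primer dimerization are deliberately not evaluated.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a non-empty primer sequence."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def meets_length_and_gc(primer, min_len=22, max_len=30, min_gc=50.0):
    """Check only the length (about 22 to 30 nt) and GC content
    (about 50% or more) criteria described in the text."""
    # The length test runs first, so an empty primer never reaches gc_percent.
    return min_len <= len(primer) <= max_len and gc_percent(primer) >= min_gc
```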

[0294] Selected human cDNA libraries were used to extend the sequence. If more than one extension was necessary or desired, additional or nested sets of primers were designed.

[0295] High fidelity amplification was obtained by PCR using methods well known in the art. PCR was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg²⁺, (NH₄)₂SO₄, and 2-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair PCI A and PCI B: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 60° C., 1 min; Step 4: 68° C., 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C., 5 min; Step 7: storage at 4° C. In the alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 57° C., 1 min; Step 4: 68° C., 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C., 5 min; Step 7: storage at 4° C.
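
As a rough aid to reading the cycling parameters, the programmed hold time of such a protocol can be totalled as sketched below. The function name is hypothetical. The sketch assumes that "repeated 20 times" means 20 further passes after the first pass through Steps 2 to 4, and it ignores ramp times between temperatures and the final 4° C. storage step.

```python
def program_minutes(initial_s, cycle_steps_s, repeats, final_s):
    """Total programmed hold time in minutes.

    initial_s     -- Step 1 hold, in seconds
    cycle_steps_s -- holds for Steps 2-4, in seconds
    repeats       -- additional passes through Steps 2-4 (assumption:
                     one first pass plus this many repeats)
    final_s       -- Step 6 hold, in seconds
    """
    cycles = 1 + repeats
    total_s = initial_s + cycles * sum(cycle_steps_s) + final_s
    return total_s / 60.0


# Parameters for primer pair PCI A and PCI B, taken from the text.
pci_minutes = program_minutes(180, [15, 60, 120], 20, 300)
```

Because the Step 3 temperature does not affect hold time, the T7 and SK+ program gives the same total under the same assumptions.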

[0296] The concentration of DNA in each well was determined by dispensing 100 μl PICOGREEN quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene Oreg.) dissolved in 1×TE and 0.5 μl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar, Acton Mass.), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II (Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the concentration of DNA. A 5 μl to 10 μl aliquot of the reaction mixture was analyzed by electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the sequence.

[0297] The extended nucleotides were desalted and concentrated, transferred to 384-well plates, digested with CviJI chlorella virus endonuclease (Molecular Biology Research, Madison Wis.), and sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%) agarose gels, fragments were excised, and agar digested with Agar ACE (Promega). Extended clones were religated using T4 ligase (New England Biolabs, Beverly Mass.) into pUC 18 vector (Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill in restriction site overhangs, and transfected into competent E. coli cells. Transformed cells were selected on antibiotic-containing media, and individual colonies were picked and cultured overnight at 37° C. in 384-well plates in LB/2× carb liquid media.

[0298] The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase (Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 60° C., 1 min; Step 4: 72° C., 2 min; Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72° C., 5 min; Step 7: storage at 4° C. DNA was quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries were reamplified using the same conditions as described above. Samples were diluted with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).

[0299] In like manner, full length polynucleotide sequences are verified using the above procedure or are used to obtain 5′ regulatory sequences using the above procedure along with oligonucleotides designed for such extension, and an appropriate genomic library.

[0300] IX. Labeling and Use of Individual Hybridization Probes

[0301] Hybridization probes derived from SEQ ID NO: 18-34 are employed to screen cDNAs, genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base pairs, is specifically described, essentially the same procedure is used with larger nucleotide fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06 software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 μCi of [γ-³²P] adenosine triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase (DuPont NEN, Boston Mass.). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25 superfine size exclusion dextran bead column (Amersham Pharmacia Biotech). An aliquot containing 10⁷ counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II (DuPont NEN).

[0302] The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon membranes (Nytran Plus, Schleicher & Schuell, Durham N.H.). Hybridization is carried out for 16 hours at 40° C. To remove nonspecific signals, blots are sequentially washed at room temperature under conditions of up to, for example, 0.1× saline sodium citrate and 0.5% sodium dodecyl sulfate. Hybridization patterns are visualized using autoradiography or an alternative imaging means and compared.

X. Microarrays

[0303] The linkage or synthesis of array elements upon a microarray can be achieved utilizing photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler, supra), mechanical microspotting technologies, and derivatives thereof. The substrate in each of the aforementioned technologies should be uniform and solid with a non-porous surface (Schena (1999), supra). Suggested substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may be produced using available methods and machines well known to those of ordinary skill in the art and may contain any appropriate number of elements. (See, e.g., Schena, M. et al. (1995) Science 270:467-470; Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat. Biotechnol. 16:27-31.)

[0304] Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be selected using software well known in the art such as LASERGENE software (DNASTAR). The array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection. After hybridization, nonhybridized nucleotides from the biological sample are removed, and a fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser desorption and mass spectrometry may be used for detection of hybridization. The degree of complementarity and the relative abundance of each polynucleotide which hybridizes to an element on the microarray may be assessed. In one embodiment, microarray preparation and usage is described in detail below.

[0305] Tissue or Cell Sample Preparation

[0306] Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and poly(A)⁺ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)⁺ RNA sample is reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/μl oligo-(dT) primer (21mer), 1× first strand buffer, 0.03 units/μl RNase inhibitor, 500 μM dATP, 500 μM dGTP, 500 μM dTTP, 40 μM dCTP, 40 μM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse transcription reaction is performed in a 25 μl volume containing 200 ng poly(A)⁺ RNA with GEMBRIGHT kits (Incyte). Specific control poly(A)⁺ RNAs are synthesized by in vitro transcription from non-coding yeast genomic DNA. After incubation at 37° C. for 2 hr, each reaction sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 μl of 0.5M sodium hydroxide and incubated for 20 minutes at 85° C. to stop the reaction and degrade the RNA. Samples are purified using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. (CLONTECH), Palo Alto Calif.) and after combining, both reaction samples are ethanol precipitated using 1 μl of glycogen (1 mg/ml), 60 μl sodium acetate, and 300 μl of 100% ethanol. The sample is then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook N.Y.) and resuspended in 14 μl 5×SSC/0.2% SDS.

[0307] Microarray Preparation

[0308] Sequences of the present invention are used to generate array elements. Each array element is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 μg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech).

[0309] Purified array elements are immobilized on polymer-coated glass slides. Glass microscope slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR Scientific Products Corporation (VWR), West Chester Pa.), washed extensively in distilled water, and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110° C. oven.

[0310] Array elements are applied to the coated glass substrate using a procedure described in U.S. Pat. No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic apparatus. The apparatus then deposits about 5 nl of array element sample per slide.

[0311] Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate buffered saline (PBS) (Tropix, Inc., Bedford Mass.) for 30 minutes at 60° C. followed by washes in 0.2% SDS and distilled water as before.

[0312] Hybridization

[0313] Hybridization reactions contain 9 μl of sample mixture consisting of 0.2 μg each of Cy3 and Cy5 labeled cDNA synthesis products in 5×SSC, 0.2% SDS hybridization buffer. The sample mixture is heated to 65° C. for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 μl of 5×SSC in a corner of the chamber. The chamber containing the arrays is incubated for about 6.5 hours at 60° C. The arrays are washed for 10 min at 45° C. in a first wash buffer (0.1×SSC, 0.1% SDS), three times for 10 minutes each at 45° C. in a second wash buffer (0.1×SSC), and dried.

[0314] Detection

[0315] Reporter-labeled hybridization complexes are detected with a microscope equipped with an Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara Calif.) capable of generating spectral lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is focused on the array using a 20× microscope objective (Nikon, Inc., Melville N.Y.). The slide containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned past the objective. The 1.8 cm×1.8 cm array used in the present example is scanned with a resolution of 20 micrometers.
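
The size of the resulting scan can be estimated with a short calculation. The sketch below assumes, for illustration, that one pixel is recorded per 20-micrometer step in each direction; the function name is hypothetical.

```python
def pixels_per_side(side_cm, resolution_um):
    """Number of pixels along one side of a square scan, assuming one
    pixel per resolution step (1 cm = 10,000 micrometers)."""
    return round(side_cm * 10_000 / resolution_um)


side = pixels_per_side(1.8, 20)   # pixels along each 1.8 cm edge
total_pixels = side * side        # pixels in the full raster scan
```

Under this assumption the 1.8 cm×1.8 cm array corresponds to 900 pixels per side.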

[0316] In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, Hamamatsu Photonics Systems, Bridgewater N.J.) corresponding to the two fluorophores. Appropriate filters positioned between the array and the photomultiplier tubes are used to filter the signals. The emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, although the apparatus is capable of recording the spectra from both fluorophores simultaneously.

[0317] The sensitivity of the scans is typically calibrated using the signal intensity generated by a cDNA control species added to the sample mixture at a known concentration. A specific location on the array contains a complementary DNA sequence, allowing the intensity of the signal at that location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples from different sources (e.g., representing test and control cells), each labeled with a different fluorophore, are hybridized to a single array for the purpose of identifying genes that are differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and adding identical amounts of each to the hybridization mixture.

[0318] The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital (A/D) conversion board (Analog Devices, Inc., Norwood Mass.) installed in an IBM-compatible PC computer. The digitized data are displayed as an image where the signal intensity is mapped using a linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission spectra) between the fluorophores using each fluorophore's emission spectrum.
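
A linear mapping from 12-bit digitized intensity to one of 20 display colors can be sketched as follows. The function name and the equal-width binning are assumptions for illustration; the actual color table used by the display software is not given in the text.

```python
def color_index(adc_value, n_colors=20, adc_bits=12):
    """Map a digitized intensity to a color index in equal-width bins.

    Index 0 corresponds to the low-signal (blue) end of the scale and
    index n_colors - 1 to the high-signal (red) end.
    """
    levels = 1 << adc_bits  # 4096 levels for a 12-bit converter
    if not 0 <= adc_value < levels:
        raise ValueError("value outside the converter's range")
    return adc_value * n_colors // levels
```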

[0319] A grid is superimposed over the fluorescence signal image such that the signal from each spot is centered in each element of the grid. The fluorescence signal within each element is then integrated to obtain a numerical value corresponding to the average intensity of the signal. The software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
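
The grid-based integration can be illustrated with a minimal sketch. It is not the GEMTOOLS algorithm, whose details are not described here; it assumes a regular grid of square elements aligned to the image origin, and the function name is hypothetical.

```python
def mean_spot_intensities(image, cell_size):
    """Average the pixel values inside each square grid element.

    image     -- list of equal-length rows of pixel intensities
    cell_size -- side length of one grid element, in pixels
    Returns a list of rows, one mean value per grid element.
    """
    n_rows = len(image) // cell_size
    n_cols = len(image[0]) // cell_size if image else 0
    result = []
    for r in range(n_rows):
        row_means = []
        for c in range(n_cols):
            pixels = [
                image[y][x]
                for y in range(r * cell_size, (r + 1) * cell_size)
                for x in range(c * cell_size, (c + 1) * cell_size)
            ]
            row_means.append(sum(pixels) / len(pixels))
        result.append(row_means)
    return result
```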

XI. Complementary Polynucleotides

[0320] Sequences complementary to the PRTS-encoding sequences, or any parts thereof, are used to detect, decrease, or inhibit expression of naturally occurring PRTS. Although use of oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of PRTS. To inhibit transcription, a complementary oligonucleotide is designed from the most unique 5′ sequence and used to prevent promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the PRTS-encoding transcript.
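
Deriving a complementary oligonucleotide from a target region amounts to taking its reverse complement, as sketched below. The function names are hypothetical, the output is written as DNA in the 5′ to 3′ direction, and selection of the target region itself (for example, the most unique 5′ sequence) is not addressed.

```python
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(target):
    """Return the DNA reverse complement of a DNA or RNA target sequence."""
    seq = target.upper()
    unknown = set(seq) - set("ACGTU")
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return seq.translate(_COMPLEMENT)[::-1]


def in_described_length_range(oligo, min_len=15, max_len=30):
    """Check the length range of about 15 to 30 bases given in the text."""
    return min_len <= len(oligo) <= max_len
```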

[0321] XII. Expression of PRTS

[0322] Expression and purification of PRTS is achieved using bacterial or virus-based expression systems. For expression of PRTS in bacteria, cDNA is subcloned into an appropriate vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic resistant bacteria express PRTS upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). Expression of PRTS in eukaryotic cells is achieved by infecting insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding PRTS by either homologous recombination or bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional genetic modifications to baculovirus. (See Engelhard, E. K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945.)

[0323] In most expression systems, PRTS is synthesized as a fusion protein with, e.g., glutathione S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia Biotech). Following purification, the GST moiety can be proteolytically cleaved from PRTS at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins (QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra, ch. 10 and 16). Purified PRTS obtained by these methods can be used directly in the assays shown in Examples XVI, XVII, XVIII, and XIX where applicable.

XIII. Functional Assays

[0324] PRTS function is assessed by expressing the sequences encoding PRTS at physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice include PCMV SPORT (Life Technologies) and PCR3.1 (Invitrogen, Carlsbad Calif.), both of which contain the cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently transfected into a human cell line, for example, an endothelial or hematopoietic cell line, using either liposome formulations or electroporation. 1-2 μg of an additional plasmid containing sequences encoding a marker protein are co-transfected. Expression of a marker protein provides a means to distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These events include changes in nuclear DNA content as measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow Cytometry, Oxford, New York N.Y.

[0325] The influence of PRTS on gene expression can be assessed using highly purified populations of cells transfected with sequences encoding PRTS and either CD64 or CD64-GFP. CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success N.Y.). mRNA can be purified from the cells using methods well known by those of skill in the art. Expression of mRNA encoding PRTS and other genes of interest can be analyzed by northern analysis or microarray techniques.

XIV. Production of PRTS Specific Antibodies

[0326] PRTS substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., Harrington, M. G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to immunize rabbits and to produce antibodies using standard protocols.

[0327] Alternatively, the PRTS amino acid sequence is analyzed using LASERGENE software (DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is synthesized and used to raise antibodies by means known to those of skill in the art. Methods for selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the art. (See, e.g., Ausubel, 1995, supra, ch. 11.)

[0328] Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH (Sigma-Aldrich, St. Louis Mo.) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity. (See, e.g., Ausubel, 1995, supra.) Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-PRTS activity by, for example, binding the peptide or PRTS to a substrate, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG.

XV. Purification of Naturally Occurring PRTS Using Specific Antibodies

[0329] Naturally occurring or recombinant PRTS is substantially purified by immunoaffinity chromatography using antibodies specific for PRTS. An immunoaffinity column is constructed by covalently coupling anti-PRTS antibody to an activated chromatographic resin, such as CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is blocked and washed according to the manufacturer's instructions.

[0330] Media containing PRTS are passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of PRTS (e.g., high ionic strength buffers in the presence of detergent). The column is eluted under conditions that disrupt antibody/PRTS binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and PRTS is collected.

XVI. Identification of Molecules Which Interact with PRTS

[0331] PRTS, or biologically active fragments thereof, are labeled with ¹²⁵I Bolton-Hunter reagent. (See, e.g., Bolton, A. E. and W. M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled PRTS, washed, and any wells with labeled PRTS complex are assayed. Data obtained using different concentrations of PRTS are used to calculate values for the number, affinity, and association of PRTS with the candidate molecules.

[0332] Alternatively, molecules interacting with PRTS are analyzed using the yeast two-hybrid system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech).

[0333] PRTS may also be used in the PATHCALLING process (CuraGen Corp., New Haven Conn.) which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Pat. No. 6,057,101).

XVII. Demonstration of PRTS Activity

[0334] Protease activity is measured by the hydrolysis of appropriate synthetic peptide substrates conjugated with various chromogenic molecules in which the degree of hydrolysis is quantified by spectrophotometric (or fluorometric) absorption of the released chromophore (Beynon, R. J. and J. S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University Press, New York N.Y., pp. 25-55). Peptide substrates are designed according to the category of protease activity as endopeptidase (serine, cysteine, aspartic proteases, or metalloproteases), aminopeptidase (leucine aminopeptidase), or carboxypeptidase (carboxypeptidases A and B, procollagen C-proteinase). Commonly used chromogens are 2-naphthylamine, 4-nitroaniline, and furylacrylic acid. Assays are performed at ambient temperature and contain an aliquot of the enzyme and the appropriate substrate in a suitable buffer. Reactions are carried out in an optical cuvette, and the increase/decrease in absorbance of the chromogen released during hydrolysis of the peptide substrate is measured. The change in absorbance is proportional to the enzyme activity in the assay.
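
Because the change in absorbance is proportional to enzyme activity, a rate can be estimated from a time course of absorbance readings. The sketch below uses an ordinary least-squares slope and, as an assumption not stated in the text, converts it to a concentration rate with the Beer-Lambert relation; the function names are hypothetical, and the extinction coefficient must be supplied by the user for the chromogen and wavelength in use.

```python
def absorbance_slope(times_min, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units/min)."""
    n = len(times_min)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    return sxy / sxx


def product_rate_molar_per_min(slope, extinction_coeff, path_cm=1.0):
    """Convert an absorbance slope to mol/L per minute (Beer-Lambert).

    extinction_coeff is in L/(mol*cm) and is a user-supplied value.
    """
    return slope / (extinction_coeff * path_cm)
```

A negative slope corresponds to the case where absorbance decreases during hydrolysis; only the magnitude matters for comparing activities measured under the same conditions.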

[0335] An alternate assay for ubiquitin hydrolase activity measures the hydrolysis of a ubiquitin precursor. The assay is performed at ambient temperature and contains an aliquot of PRTS and the appropriate substrate in a suitable buffer. Chemically synthesized human ubiquitin-valine may be used as substrate. Cleavage of the C-terminal valine residue from the substrate is monitored by capillary electrophoresis (Franklin, K. et al. (1997) Anal. Biochem. 247:305-309).

[0336] In the alternative, an assay for protease activity takes advantage of fluorescence resonance energy transfer (FRET) that occurs when one donor and one acceptor fluorophore with an appropriate spectral overlap are in close proximity. A flexible peptide linker containing a cleavage site specific for PRTS is fused between a red-shifted variant (RSGFP4) and a blue variant (BFP5) of Green Fluorescent Protein. This fusion protein has spectral properties that suggest energy transfer is occurring from BFP5 to RSGFP4. When the fusion protein is incubated with PRTS, the substrate is cleaved, and the two fluorescent proteins dissociate. This is accompanied by a marked decrease in energy transfer which is quantified by comparing the emission spectra before and after the addition of PRTS (Mitra, R. D. et al. (1996) Gene 173:13-17). This assay can also be performed in living cells. In this case the fluorescent substrate protein is expressed constitutively in cells and PRTS is introduced on an inducible vector so that FRET can be monitored in the presence and absence of PRTS (Sagot, I. et al. (1999) FEBS Lett. 447:53-57).

XVIII. Identification of PRTS Substrates

[0337] Phage display libraries can be used to identify optimal substrate sequences for PRTS. A random hexamer followed by a linker and a known antibody epitope is cloned as an N-terminal extension of gene III in a filamentous phage library. Gene III codes for a coat protein, and the epitope will be displayed on the surface of each phage particle. The library is incubated with PRTS under proteolytic conditions so that the epitope will be removed if the hexamer codes for a PRTS cleavage site. An antibody that recognizes the epitope is added along with immobilized protein A. Uncleaved phage, which still bear the epitope, are removed by centrifugation. Phage in the supernatant are then amplified and undergo several more rounds of screening. Individual phage clones are then isolated and sequenced. Reaction kinetics for these peptide substrates can be studied using an assay in Example XVII, and an optimal cleavage sequence can be derived (Ke, S. H. et al. (1997) J. Biol. Chem. 272:16603-16609).

[0338] To screen for in vivo PRTS substrates, this method can be expanded to screen a cDNA expression library displayed on the surface of phage particles (T7SELECT 10-3 Phage display vector, Novagen, Madison Wis.) or yeast cells (pYD1 yeast display vector kit, Invitrogen, Carlsbad Calif.). In this case, entire cDNAs are fused between Gene III and the appropriate epitope.

XIX. Identification of PRTS Inhibitors

[0339] Compounds to be tested are arrayed in the wells of a multi-well plate in varying concentrations along with an appropriate buffer and substrate, as described in the assays in Example XVII. PRTS activity is measured for each well and the ability of each compound to inhibit PRTS activity can be determined, as well as the dose-response kinetics. This assay could also be used to identify molecules which enhance PRTS activity.
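
The dose-response analysis described above can be sketched in Python as follows. The sketch assumes that activity in a compound-free control well defines 100% activity and estimates the concentration giving 50% inhibition by log-linear interpolation between the two bracketing concentrations. The interpolation method, the function names, and the example values are illustrative assumptions; a full curve fit may be preferred in practice.

```python
import math


def percent_inhibition(activity, uninhibited_activity):
    """Percent reduction in activity relative to a compound-free control well."""
    return 100.0 * (1.0 - activity / uninhibited_activity)


def ic50_by_interpolation(concentrations, activities, uninhibited_activity):
    """Estimate the concentration giving 50% inhibition.

    Interpolates linearly in log10(concentration) between the two tested
    concentrations that bracket 50% inhibition. Concentrations must be
    positive. Returns None if 50% inhibition is never bracketed.
    """
    points = sorted(zip(concentrations, activities))
    inhibition = [
        (c, percent_inhibition(a, uninhibited_activity)) for c, a in points
    ]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(inhibition, inhibition[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None


# Hypothetical well readings: concentrations in micromolar, activity in
# arbitrary units, with a control activity of 100.
concentrations = [0.1, 1.0, 10.0, 100.0]
activities = [95.0, 75.0, 25.0, 5.0]
print(ic50_by_interpolation(concentrations, activities, 100.0))
```

A compound that enhances PRTS activity would give negative percent inhibition values in this scheme, so the same readout could flag candidate activators.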

[0340] In the alternative, phage display libraries can be used to screen for peptide PRTS inhibitors. Candidates are found among peptides which bind tightly to a protease. In this case, multi-well plate wells are coated with PRTS and incubated with a random peptide phage display library or a cyclic peptide library (Koivunen, E. et al. (1999) Nat. Biotechnol. 17:768-774). Unbound phage are washed away and selected phage amplified and rescreened for several more rounds. Candidates are tested for PRTS inhibitory activity using an assay described in Example XVI.

[0341] Various modifications and variations of the described methods and systems of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with certain embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology or related fields are intended to be within the scope of the following claims.

TABLE 1

Incyte        Polypeptide    Incyte            Polynucleotide    Incyte
Project ID    SEQ ID NO:     Polypeptide ID    SEQ ID NO:        Polynucleotide ID
6930294       1              6930294CD1        18                6930294CB1
7473018       2              7473018CD1        19                7473018CB1
7479221       3              7479221CD1        20                7479221CB1
2923874       4              2923874CD1        21                2923874CB1
55122335      5              55122335CD1       22                55122335CB1
7473550       6              7473550CD1        23                7473550CB1
7478108       7              7478108CD1        24                7478108CB1
7482021       8              7482021CD1        25                7482021CB1
7482145       9              7482145CD1        26                7482145CB1
55022586      10             55022586CD1       27                55022586CB1
3238072       11             3238072CD1        28                3238072CB1
7482034       12             7482034CD1        29                7482034CB1
7474351       13             7474351CD1        30                7474351CB1
2232483       14             2232483CD1        31                2232483CB1
7481712       15             7481712CD1        32                7481712CB1
8213480       16             8213480CD1        33                8213480CB1
7478405       17             7478405CD1        34                7478405CB1

[0342] TABLE 2

Polypeptide   Incyte            GenBank      Probability
SEQ ID NO:    Polypeptide ID    ID NO:       score        GenBank Homolog
1             6930294CD1        g190418      4.50E−169    [Homo sapiens] preprocathepsin L precursor (Joseph, L. J. et al. (1988) J. Clin. Invest. 81: 1621-1629)
2             7473018CD1        g5669607     2.20E−25     [Equus caballus] caspase-1 Wardlow, S. et al. (1999) Nucleotide sequence of equine caspase-1 cDNA. DNA Seq. 10: 133-137.
3             7479221CD1        g6573163     3.80E−298    [Rattus norvegicus] ubiquitin specific processing protease Lin, H. et al. (2000) Divergent N-terminal sequences target an inducible testis deubiquitinating enzyme to distinct subcellular structures. Mol. Cell Biol. 20: 6568-6578.
4             2923874CD1        g306706      5.20E−207    [Homo sapiens] dipeptidyl aminopeptidase like protein (Yokotani, N. et al. (1993) Hum. Mol. Genet. 2: 1037-1039)
5             55122335CD1       g10800858    0            [fl] [Homo sapiens] aminopeptidase B
6             7473550CD1        g2981641     6.40E−201    [Xenopus laevis] ovochymase/ovotryptase polyprotease (Lindsay, L. L. et al. (1999) Proc. Natl. Acad. Sci. U.S.A. 96: 11253-11258)
7             7478108CD1        g544755      3.00E−166    [Oryctolagus cuniculus] aminopeptidase N, APN {type II membrane protein} (Santos, A. N. et al. (2000) Cell. Immunol. 201: 22-32)
8             7482021CD1        g6573165     5.60E−210    [Rattus norvegicus] testis ubiquitin specific processing protease Lin, H. et al. (2000) Divergent N-terminal sequences target an inducible testis deubiquitinating enzyme to distinct subcellular structures. Mol. Cell Biol. 20: 6568-6578.
9             7482145CD1        g6683668     1.70E−114    [Carassius auratus] alpha 4 subunit of 20S proteasome (Tokumoto, M. et al. (2000) Eur. J. Biochem. 267: 97-103)
10            55022586CD1       g14279329    0            [fl] [Homo sapiens] ubiquitin specific protease
11            3238072CD1        g5410230     5.50E−56     [Homo sapiens] ubiquitin-specific protease 3 (Sloper-Mould, K. E. et al. (1999) J. Biol. Chem. 274: 26878-26884)
12            7482034CD1        g4545092     3.80E−60     [Sus scrofa] proteasome subunit LMP7 (Chun, T. et al. (1999) Immunogenetics 49: 72-77)
13            7474351CD1        g4512604     5.00E−48     [Canis sp.] mastin precursor (Rice, K. D. et al. (1998) Curr. Pharm. Des. 4: 381-396)
14            2232483CD1        g6465985     1.40E−229    [Homo sapiens] quiescent cell proline dipeptidase (Underwood, R. (1999) J. Biol. Chem. 274: 34053-34058)
15            7481712CD1        g13528975    1.00E−122    [fl] [Homo sapiens] (BC005279) carboxypeptidase A1 (pancreatic)
16            8213480CD1        g13157560    0            [3′ incom] [Homo sapiens] dJ964F7.1 (novel disintegrin and reprolysin metalloproteinase family protein)
17            7478405CD1        g5923786     9.10E−164    [Homo sapiens] zinc metalloprotease ADAMTS6 (Hurskainen, T. L. et al. (1999) J. Biol. Chem. 274: 25555-25563)

[0343] TABLE 3 SEQ Incyte Amino Potential Potential Analytical IDPolypeptide Acid Phosphorylation Glycosylation Signature Sequences,Methods and NO: ID Residues Sites Sites Domains and Motifs Databases 1 6930294CD1 333 S160 S210 T155 N221 Papain family cysteine proteaseHMMER_PFAM T84 Y112 Peptidase_C1: A114-T332 Eukaryotic thiol proteaseactive site BLIMPS_BLOCKS BL00139: Q132-F141, N175-M183, D275- S284,Y295-Y311 PAPAIN CYSTEINE PROTEASE BLIMPS_PRINTS PR00705: Q132-L147,H276-E286, Y295-R301 EUKARYOTIC THIOL PROTEASES CYSTEINE BLAST_DOMODM00081|P07711|19-332: L19-V333 DM00081|P25975|20-333: D22-V333DM00081|P06797|19-332: F21-V333 DM00081|P15242|20-332: T20-V333 PROTEASEPRECURSOR SIGNAL CYSTEINE BLAST_PRODOM PROTEINASE HYDROLASE THIOLZYMOGEN CATHEPSIN GLYCOPROTEIN PD000158: S117-S218, C169-P331 PD000247:K31-E113 Eukaryotic thiol (cysteine) protease MOTIFS active sites:Thiol_Protease_Cys: Q132-A143 Thiol_Protease_His: L274-S284 Eukaryoticthiol (cysteine) protease PROFILESCAN active sitesthiol_protease_cys.prf: E113-E163 thiol_protease_his.prf: Q257-G307signal_peptide: M1-T20 HMMER signal_cleavage: M1-A17 SPSCAN 2 7473018CD1 90 S36 T49 T62 N47 CASPASE RECRUITMENT DOMAIN CARD:HMMER_PFAM INTERLEUKIN-1 BETA CONVERTING ENZYME BLAST_DOMO FAMILYHISTIDINE DM07463|P29466|1-122: M1-L89 DM07463|P29452|1-121: M1-L89signal_cleavage: M1-S36 SPSCAN 3  7479221CD1 605 S14 S142 S152 N548 N574Ubiquitin carboxyl-terminal hydrolase HMMER_PFAM S158 S190 S207 family 2signatures S329 S335 S382 UCH-1: A267-R298 S42 S490 S506 UCH-2:N537-L598 S70 T12 T137 Ubiquitin carboxyl-terminal hydrolaseBLIMPS_BLOCKS T175 T227 T235 family 2 signature T239 T377 T424 BL00972:G268-L285, Y353-L362, I411- T433 T454 T463 C425, V540-S564, T567-T588T512 T572 T7 T99 UBIQUITIN CARBOXYL-TERMINAL BLAST_DOMO Y17 Y22HYDROLASES FAMILY 2 DM00659|P40818|782-1103: L272-L594DM00659|P35123|139-432: L272-I445 DM00659|P35125|220-508: L272-L455DM00659|P32571|566-873: N271-F531 PROTEASE UBIQUITIN HYDROLASE ENZYMEBLAST_PRODOM 
UBIQUITINSPECIFIC CARBOXYLTERMINAL DEUBIQUITINATINGTHIOLESTERASE PROCESSING CONJUGATION PD000590: M258-S432 PD017412:F435-E534 Ubiquitin carboxyl-terminal hydrolase MOTIFS family 2signatures Uch_2_1: G268-Q283 Uch_2_2: Y541-Y558 4  2923874CD1 743 S157S163 S260 N204 N289 Dipeptidyl peptidase IV active site HMMER_PFAM S304S355 S393 N58 N66 signature S589 S593 S635 N695 N707 DPPIV_N_term:M1-D525 S643 S709 T238 Prolyl oligopeptidase family HMMER_PFAM T294 T361T382 Peptidase_S9: F527-I603 T423 T524 T71 Prolyl endopeptidase familyBLIMPS_BLOCKS Y508 BL00708B: D573-I603 Dipeptidyl peptidase IVBLIMPS_PFAM PF00930: H77-Y98, R159-P209, Y221-Y247, E265-E297,L365-I375, E420-N465, P499- I536, D537-K579, F615-P642, N665-L685 PROLYLENDOPEPTIDASE FAMILY SERINE BLAST_DOMO DM02461|P42659|335-862: P222-E743DM02461|P27487|192-765: E167-C727 DM02461|I38593|190-759: I169-C727DM02461|P33894|340-930: I169-V694, Y221- H715 DPP IV HYDROLASE PROTEASESERINE BLAST_PRODOM PEPTIDASE DIPEPTIDASE TRANSMEMBRANE GLYCOPROTEINPD003086: Y20-P493, S275-T524 PD003048: I603-C727 5 55122335CD1 650 S208S318 S359 Peptidase family M1 HMMER_PFAM S496 T141 T368 Peptidase_M1:R32-G417 T374 T386 T408 MEMBRANE ALANYL DIPEPTIDASE FAMILY BLIMPS_PRINTST412 SIGNATURE PR00756: R176-Y191, F220-I235, F295-L305, V322-T337,W341-Y353 Neutral Zn metalloprotease, Zn-binding BLIMPS_BLOCKS regionBL00142: V322-F332 do HYDROLASE; LEUKOTRIENE; A-4; ZINC; BLAST_DOMODM08707|P19602|7-609: H38-H634 DM08707|Q10740|58-670: W152-H634, A26-S82 do ZINC; AMINOPEPTIDASE; BLAST_DOMO METALLOPEPTIDASE; NEUTRAL;DM00700|I55441|163-916: A159-P489 DM00700|S47274|1-784: G160-P489AMINOPEPTIDASE B EC 3.4.11.6 ARGINYL BLAST_PRODOM ARGININE CYTOSOL IVAPB HYDROLASE ZINC METALLOPROTEASE PD143187: A2-F165 HYDROLASE ZINCMETALLOPROTEASE BLAST_PRODOM LEUKOTRIENE A4 LTA4 A4 MULTIFUNCTIONALENZYME BIOSYNTHESIS PD008823: Y533-Q643 AMINOPEPTIDASE HYDROLASEBLAST_PRODOM METALLOPROTEASE ZINC N GLYCOPROTEIN TRANSMEMBRANESIGNALANCHOR MEMBRANE PD001134: R248-D518 Neutral 
Zn metalloprotease,Zn-binding MOTIFS region Zinc_Protease: V322-W331 6  7473550CD1 932 S319S326 S353 N324 N424 Trypsin family active site HMMER_PFAM S387 S394 S426N500 N52 trypsin: I47-I291, I568-I809 S49 S565 S665 N706 N99 CUB domainHMMER_PFAM S708 S840 S906 CUB: S310-V400, C412-F521 S91 S93 T103 Serineproteases, trypsin family active BLIMPS_BLOCKS T126 T297 T337 siteBL00134: C593-C609, D759-G782, T454 T545 T744 P796-I809 T853 T910 Y735Kringle domain proteins BLIMPS_BLOCKS BL00021B: C72-L89 CHYMOTRYPSINSERINE PROTEASE ACTIVE BLIMPS_PRINTS SITE PR00722: G594-C609, S653-L667TRYPSIN BLAST_DOMO DM00018|P23578|42-289: R567-I813, R46- P268DM00018|A57014|45-284: I568-I813, N52- P268 DM00018|P48038|39-286:R567-I813, P259-P268 DM00018|P03952|392-624: G570- K812, N52-Q293PROTEASE SERINE PRECURSOR SIGNAL BLAST_PRODOM HYDROLASE ZYMOGENGLYCOPROTEIN FAMILY MULTIGENE FACTOR PD000046: G589-I809, W50-I291Serine proteases, trypsin family, active MOTIFS sites Trypsin_His:V83-C88, L604-C609 Trypsin_Ser: D231-V242 Serine proteases, trypsinfamily, active PROFILESCAN sites trypsin_his.prf: L64-Q115, L585- T634trypsin_ser.prf: L216-G264, I743-Q792 signal_peptide: M1-G22 HMMERsignal_cleavage: M1-G22 SPSCAN 7  7478108CD1 990 S200 S237 S282 N132N168 Peptidase family M1 HMMER_PFAM S353 S442 S536 N261 N288Peptidase_M1: L98-G506 S54 S631 S641 N319 N338 MEMBRANE ALANYLDIPEPTIDASE FAMILY BLIMPS_PRINTS S643 S74 S835 N346 N360 SIGNATUREPR00756: W431-Y443, R245-F260, S917 S979 T128 N582 N600 F297-I312,F376-L386, V412-T427 T134 T141 T321 N607 N619 Neutral Znmetalloprotease, Zn-binding BLIMPS_BLOCKS T403 T562 T605 N653 N848region BL00142: V412-F422 T69 T706 T850 N887 do ZINC; AMINOPEPTIDASE;BLAST_DOMO T885 T967 T990 METALLOPEPTIDASE; NEUTRAL;DM00700|P15541|67-903: W93-I932 DM00700|P15145|66-901: W93-I929DM00700|P15684|70-903: W93-I929 DM00700|P15144|70-904: W93-I932AMINOPEPTIDASE HYDROLASE BLAST_PRODOM METALLOPROTEASE ZINC GLYCOPROTEINTRANSMEMBRANE SIGNALANCHOR PD001134: Q95-T585 PD002091: 
V587-Y874Neutral Zn metalloprotease, Zn-binding MOTIFS region Zinc_Protease:V412-W421 signal_peptide: M1-A31 HMMER signal_cleavage: M1-A34 SPSCANtransmembrane domain: A16-Y37 HMMER 8  7482021CD1 396 S120 S126 S173N339 N365 Ubiquitin carboxyl-terminal hydrolase HMMER_PFAM S281 S297T168 family 2 UCH-1: A58-R89 T215 T224 T245 UCH-2: N328-L389 T254 T303T363 Ubiquitin carboxyl-terminal hydrolase BLIMPS_BLOCKS T8 family 2BL00972: G59-L76, Y144-L153, I202-C216, V331-S355, T358-T379 UBIQUITINCARBOXYL-TERMINAL HYDRO- BLAST_DOMO LASE FAMILY 2DM00659|P40818|782-1103: L63- L385 DM00659|P35123|139-432: L63-I236DM00659|P35125|220-508: L63-L246 DM00659|P32571|566-873: N62-F322PROTEASE UBIQUITINSPECIFIC HYDROLASE BLAST_PRODOM ENZYME C-TERMINALDEUBIQUITINATING THIOLESTERASE PROCESSING CONJUGATION PD000590: S51-S223PD017412: F226-E325 Ubiquitin carboxyl-terminal hydrolase MOTIFS family2 Uch_2_1: G59-Q74 Uch_2_2: Y332-Y349 signal_cleavage: M1-N46 SPSCAN 9 7482145CD1 250 S166 S185 S223 N177 Proteasome A-type and B-typeHMMER_PFAM S246 S3 S32 S95 proteasome: T33-T179 T115 T169 T232Proteasome A-type subunits signature BLIMPS_BLOCKS T60 T99 Y178 BL00388:Y5-K50, K63-V104, Q118-D139, L146 -K176 Proteasome A-type and B-typePF00227: BLIMPS_PFAM F12-Y23 Proteasome A-type subunits signaturePROFILESCAN proteasome.prf: M1-V47 PROTEASOME A-TYPE SUBUNITS BLAST_DOMODM00341|P48004|1-226: Y5-S223 DM00341|S23451|3-222: S3-M221DM00341|P22769|3-222: S3-M221 DM00341|P34120|4-220: S3-L219 PROTEASOMEHYDROLASE PROTEASE BLAST_PRODOM SUBUNIT MULTICATALYTIC COMPLEXENDOPEPTIDASE MACROPAIN COMPONENT PROTEIN PD000280: S32-K191 ProteasomeA-type subunits signature MOTIFS Proteasome_A: Y5-A27 10 55022586CD11045 S47, S76, S109, N282, PROBABLE UBIQUITIN CARBOXYLTERMINALBLAST-PRODOM S113, T130, N310, HYDROLASE K02C4.3 EC 3.1.2.15 T134, S137,N373, THIOLESTERASE UBIQUITINSPECIFIC S205, T207, N639, PROCESSINGPROTEASE DEUBIQUITINATING S228, S248, N711, N822 ENZYME HYPOTHETICALPROTEIN CONJU- T260, S279, GATION THIOL: 
PD138085: F540-S720, Y316-S720S347, S368, Ubiquitin carboxyl-terminal hydrolase BLIMPS-BLOCKS S453,S479, family 2 proteins: BL00972: G163-L180, T484, T489, E251-T260,P583-N607, R610-R631 S494, S503, Ubiquitin carboxyl-terminal hydrolaseHMMER-PFAM S504, S517, family: UCH-1: V162-Y193, UCH-2: R580- S520,T532, N649 T534, S550, Ubiquitin carboxyl-terminal hydrolase MOTIFSS620, S624, family: Uch_2_2: Y584-Y601 S625, T662, S668, S700, S713,T719, T753, S760, S787, T813, S824, S867, T872, Y888, S930, T934, S939,S964, T1016, S1021, T1043 11  3238072CD1 622 S142 S166 S221 N243 N424Ubiquitin carboxyl-terminal hydrolase HMMER_PFAM S238 S245 S250 N566family 2 S285 S304 S363 UCH-1: T187-L218 S455 S460 S493 UCH-2: E528-Q590S523 S56 S574 Ubiquitin carboxyl-terminal hydrolase BLIMPS_BLOCKS S611S96 T149 family 2 T165 T173 T354 BL00972: G188-L205, Y329-L338; V375-T355 T368 T438 C389, Y532-N556, G559-K580 T50 T589 T88 UBIQUITINCARBOXYL-TERMINAL BLAST_DOMO HYDROLASES FAMILY 2DM00659|P40818|782-1103: L291-G542, V421-L586, L192-F215DM00659|Q09738|149-388: K306-V421, V421-G542, N191-N217DM00659|S57874|537-787: H288-T426, L192-N217 PROTEASE UBIQUITINSPECIFICHYDROLASE BLAST_PRODOM ENZYME CARBOXYLTERMINAL DEUBIQUITINATINGTHIOLESTERASE PROCESSING CONJUGATION PD000590: N281-T398, A183-T223Ubiquitin carboxyl-terminal hydrolase MOTIFS family 2 Uch_2_1: G188-Q203Uch_2_2: Y532-Y550 12  7482034CD1 345 S125 S201 S242 Proteasome A-typeand B-type subunit HMMER_PFAM S276 S282 S37 proteasome: T96-R238 T142T258 T332 Proteasome B-type subunit BLIMPS_BLOCKS Y207 BL00854:A99-A144, F206-D234, A257-G266 PROTEASOME COMPONENT SIGNATUREBLIMPS_PRINTS PR00141: H259-L270, F102-G117, G223-D234, D234-E245PROTEASOME B-TYPE SUBUNITS BLAST_DOMO DM00618|P28062|46-260: G77-V281DM00618|P30656|48-264: P80-W278 DM00618|P28072|5-222: P80-E279DM00618|I49120|1-185: L98-E279 PROTEASOME HYDROLASE PROTEASEBLAST_PRODOM SUBUNIT MULTICATALYTIC COMPLEX ENDOPEPTIDASE MACROPAINCOMPONENT PD000280: T95-E245 Proteasome_B: L98-D145 
MOTIFSsignal_peptide: M1-A30 HMMER signal_cleavage: M1-A30 SPSCAN 13 7474351CD1 948 S179 S19 S194 N159 N247 Trypsin family serine proteaseactive HMMER_PFAM S287 S310 S514 N325 N335 site trypsin: A218-I406,V419-Q496, S522 S613 S648 N372 N630 L636-R761 S687 S751 S923 Trypsinfamily serine protease active PROFILESCAN T150 T315 T327 sitetrypsin_ser.prf: R705-G748 T337 T578 T653 CHYMOTRYPSIN SERINE PROTEASEBLIMPS_PRINTS T718 T722 T738 PR00722C: R720-V732 T760 T919 T95 TRYPSINBLAST_DOMO Y467 DM00018|P19236|20-262: E212-Q408, L636- W766DM00018|P21845|31-271: F219-Q408, L643- R761 DM00018|Q02844|29-268:R220-I406, P629-R761 DM00018|P15157|31-270: E215-I406, P629- R761PROTEASE SERINE PRECURSOR SIGNAL BLAST_PRODOM HYDROLASE ZYMOGENGLYCOPROTEIN FAMILY MULTIGENE PD000046: D232-I406 Kringle domainproteins BLIMPS_BLOCKS BL00021: V276-G297, G365-I406 14  2232483CD1 444S291 S305 S402 N289 N330 Prolyl aminopeptidase family BLIMPS_PRINTS S409S60 T121 N337 N380 PR00793C: V158-R172 T212 T314 T75 N50 N86 Prolyloligopeptidase family BLIMPS_PRINTS BL00862D: G160-A180 Prolylendopeptidase family BLIMPS_BLOCKS BL00708B: D137-L167 alpha/betahydrolase fold HMMER_PFAM abhydrolase: A100-A334 do LYSOSOMAL; PRO-X;CARBOXYPEPTIDASE; BLAST_DOMO DM03192|P42785|3-487: A4-T206, V213- F342,D355-E417 DM03192|P34676|1-498: F31-V189, Y210- I377, S354-K426DM03192|P34610|31-480: R39-F342, Q324- R414 DM03192|P34528|84-584:F36-A191, C326- K415, S291-T339 PROTEIN CARBOXYPEPTIDASE LYSOSOMALBLAST_PRODOM PROX SIMILAR HUMAN CHROMOSOME III F23B2.12 PD149833:L243-N337, S360-L416 signal_peptide: M1-A21 HMMER Leucine_Zipper:L128-L149 MOTIFS signal_cleavage: M1-A21 SPSCAN 15  7481712CD1 514 S202S225 S336 N115 N249 Zinc carboxypeptidase Zn binding region HMMER_PFAMS377 T219 T316 N334 N359 Zn_carbOpept: Y217-E497 T494 T504 Y436 N93 Zinccarboxypeptidases, Zn binding BLIMPS_BLOCKS region BL00132: Y217-F257,P265-W278, Y295-K335, P339-K353, P365-H391, N393- L414, T450-G467CARBOXYPEPTIDASE A METALLOPROTEASE BLIMPS_PRINTS FAMILY 
PR00765:I243-L255, P265-I279, G345-K353, I398-Y411 Zn carboxypeptidases,Zn-binding region PROFILESCAN signatures carboxypept_zn_2.prf: E380-L435ZINC CARBOXYPEPTIDASES, ZINC-BINDING BLAST_DOMO REGION 1DM00683|P15085|112-418: R207-P513 DM00683|P48052|111-416: E206-P513DM00683|A56171|111-416: E206-P513 DM00683|P19222|111-416: E206-P513CARBOXYPEPTIDASE PRECURSOR SIGNAL BLAST_PRODOM HYDROLASE ZINC ZYMOGENPROTEIN D B GP180CARBOXYPEPTIDASE PD001916: Y217- Y411 Zinccarboxypeptidases, Zn binding MOTIFS region signatures Carboxypept_Zn_1:P265-T287 Carboxpept_Zn_2: H401-Y411 16  8213480CD1 787 S162 S389 S450N109 N145 Reprolysin (M12B) family zinc HMMER_PFAM S547 S55 S61 N231N276 metalloprotease Reprolysin: K210-P409 S761 T174 T208 N448Reprolysin family propeptide HMMER_PFAM T258 T264 T302 Pep_M12B_propep:E80-Q198 T605 Y243 Neutral Zn metallopeptidase Zn-binding BLIMPS_BLOCKSregion BL00142: T342-G352 Neutral Zn metallopeptidase Zn-bindingPROFILESCAN region zinc_protease.prf: E323-A376 Disintegrin signatureHMMER_PFAM disintegrin: E426-L501 Disintegrins signature PROFILESCANdisintegrins.prf: G352-D498 Disintegrin signature BL00427: C443-P497BLIMPS_BLOCKS DISINTEGRIN SIGNATURE BLIMPS_PRINTS PR00289: C457-R476,E486-D498 do ZINC; NEUTRAL METALLOPEPTIDASE; BLAST_DOMO ATROLYSIN;DM00368|S60257|204-414: R202-D410 DM00368|Q05910|189-395: R206-D410DM00368|P28891|1-202: E204-P409 do ZINC; REGULATED; EPIDIDYMAL;BLAST_DOMO NEUTRAL; DM00591|S60257|492-628: F487-G608 METALLOPROTEASEPRECURSOR HYDROLASE BLAST_PRODOM SIGNAL ZINC VENOM CELL TRANSMEMBRANEADHESION PD000791: R209-P409 PD000935: L70-M169 CELL ADHESION PLATELETBLOOD BLAST_PRODOM COAGULATION VENOM DISINTEGRIN METALLOPROTEASEPRECURSOR SIGNAL PD000664: E426-Y500 TRANSMEMBRANE METALLOPROTEASEBLAST_PRODOM SIGNAL PRECURSOR GLYCOPROTEIN CELL FERTILIN BETA ADHESIONPD001269: D503-L572 signal_peptide: M1-G27 HMMER Neutral Znmetallopeptidase Zn binding MOTIFS region Zinc_Protease: T342-L351signal_cleavage: M1-G27 SPSCAN 17  7478405CD1 1082 S1021 
S1060 S220 N151N190 Reprolysin family propeptide HMMER_PFAM S279 S289 S396 N313 N745Pep_M12B_propep: E111-R222 S631 S698 S795 N838 N909 Reprolysin (M12B)family zinc HMMER_PFAM S89 S914 S953 metalloprotease zinc binding regionT1025 T135 T171 Reprolysin: V295-P498 T206 T390 T421 Neutral Znmetalloprotease, Zn-binding BLIMPS_BLOCKS T65 T674 T747 region T817 T871Y270 BL00142: T433-G443 do ZINC; METALLOPEPTIDASE; NEUTRAL; BLAST_DOMOATROLYSIN; DM00368|Q05910|189-395: V295-P498 DM00368|S48169|140-343:V295-P498 DM00368|P34179|1-202: V295-P498 DM00368|P15167|190-392:V295-P498 METALLOPROTEASE PRECURSOR HYDROLASE BLAST_PRODOM SIGNAL ZINCVENOM CELL PROTEIN TRANSMEMBRANE ADHESION PD000791: V295- P498 PROTEINPROCOLLAGEN THROMBOSPONDIN BLAST_PRODOM MOTIFS NPROTEINASE A DISINTEGRINMETALLOPROTEASE WITH ADAMTS1 PD011654: V676-C748 PD013511: K509-V578Thrombospondin type 1 domain HMMER_PFAM tsp_1: S593-C643, R873-C931,G938-C991, P993-C1048 signal_cleavage: M1-S16 SPSCAN

[0344] TABLE 4 Polynucleotide Incyte Sequence Selected SEQ ID NO:Polynucleotide ID Length Fragment(s) Sequence Fragments 5′ Position 3′Position 18  6930294CB1 1187 1091-1187 6917460H1 (PLACFER06) 217 927GBI: g7939149_000003.2.edit 814 1187 5118206F6 (SMCBUNT01) 1 506 19 7473018CB1 461 g1365166 1 461 GNN.g7651935_000011_002 21 293 20 7479221CB1 1884  591-773 6981403H1 (BRAIFER05) 1144 1758 6618712H1(BRAITDR02) 375 1024 7269080H1 (OVARDIJ01) 977 1598 1241675R6(LUNGNOT03) 1311 1884 GBI.g7960351_edit 1 774 21  2923874CB1 2576  1-158 72004319V1 490 1356 71998773V1 1899 2576 7015044F8 (KIDNNOC01) 1591 72004394V1 1276 2171 22 55122335CB1 2000 1792-2000 8268324H1(BLYRTXF01) 1 636 70942077V1 1323 1994 7699471H1 (KIDPTDE01) 578 122671984458V1 1365 2000 71986878V1 1255 1928 55114534J1 677 1275 23 7473550CB1 3522   1-735, FL7473550_g8102345_000 98 3522 2323-2786,001_g2981641 1799-1940,  930-1690, 2913-3522, 2140-2240GNN.g7076703_000017_002 1 312 24  7478108CB1 3277  709-831, 1-179,6926255F8 (PLACFER06) 2274 2745 3244-3277, 1957-2354 6923595F6(PLACFER06) 1750 2603 55142456J1 1 736 55047371J1 666 1574 55047372J11063 1960 6926255H1 (PLACFER06) 2273 2689 5329258F6 (DRGTNON04) 28013277 GBI.g9256180_000003_000004.edit 2708 3268 25  7482021CB1 1254  1-76 g3016366 77 592 7037834H1 (UTRSTMR02) 180 645 1241675R6(LUNGNOT03) 684 1254 6450560H1 (BRAINOC01) 594 1238 GBI.g7960351_edit_11 147 26  7482145CB1 1120  806-854 70197639V1 1 448 GBI.g8516058_edit111 863 70166902V1 661 1120 27 55022586CB1 4577   1-71, 4368-4577,55057844J1 583 1360  986-2809 71764468V1 2926 3719 6920230F6 (PLACFER06)1290 2134 71760331V1 2434 3064 71760332V1 3836 4381 5763279F8(PROSBPT02) 3899 4577 71188036V1 2409 3037 55022577H1 788 147955057841H1 1 738 71764426V1 3058 3795 2725111T6 (OVARTUT05) 3772 43596920230R6 (PLACFER06) 1589 2433 28  3238072CB1 1952  592-644, 71929643V1659 1445 1820-1952 GBI.g10186764_000001.edit 1762 1952 3238072F6(COLAUCT01) 1127 1830 7725175J1 (THYRDIE01) 1 685 71928050V1 781 1460 29 
7482034CB1 1092   1-181, 924-1092 GBI.g9756020_000001.edit 1 174GNN.g8217882_012 55 1092 30  7474351CB1 2847   1-290, 500-284760123248D3 901 1116 GBI: g9798436_CDS_1 1 2847 CpG_WDJ300089003.R1 13231489 3532405H1 (KIDNNOT25) 960 1164 31  2232483CB1 1396   1-25 8094675H1(EYERNOA01) 636 1096 71152873V1 858 1396 1628644F6 (COLNPOT01) 1 48260220501D1 459 833 32  7481712CB1 1853   1-873 6810286H1 (SKIRNOR01) 1547 55051982H1 876 1416 GNN: g5306288_002 90 1853 33  8213480CB1 3344  1-1904, 1479739H1 (CORPNOT02) 1592 1837 2575-2624 7174969F8(BRSTTMC01) 610 1253 6831592H1 (SINTNOR01) 1 334 72142924D1 1922 24277663110F6 (UTRSTME01) 573 1021 2786453T6 (BRSTNOT13) 2780 334455113148H1 2325 3142 6958043R8 (BLADNOR01) 196 644 1252335T6 (LUNGFET03)2663 3341 7659180J1 (OVARNOE02) 1716 2302 34  7478405CB1 3389  563-308672420192D1 865 1338 g6702073 1 561 4018316F8 (BRAXNOT01) 2756 321958005173H1 2118 2793 55123782H1 1340 2115 55141002J1 567 1334 55065490J11300 1997 55123882J1 2095 2747 g1550049 3089 3389 4293359F6 (BRABDIR01)336 816

[0345] TABLE 5

Polynucleotide    Incyte
SEQ ID NO:        Project ID     Representative Library
18                6930294CB1     CONFNOT03
20                7479221CB1     LUNGNOT03
21                2923874CB1     BRAINOT22
22                55122335CB1    KIDEUNE02
24                7478108CB1     PLACFER06
25                7482021CB1     LUNGNOT03
26                7482145CB1     COLITUT02
27                55022586CB1    PROSTUS23
28                3238072CB1     ESOGTUE01
30                7474351CB1     KIDNNOT25
31                2232483CB1     BRSTNOT05
32                7481712CB1     SKIRNOR01
33                8213480CB1     UTRSTME01
34                7478405CB1     ENDMUNE01

[0346] TABLE 6

Library    Vector    Library Description

BRAINOT22    pINCY    Library was constructed using RNA isolated from right temporal lobe tissue removed from a 45-year-old Black male during a brain lobectomy. Pathology for the associated tumor tissue indicated dysembryoplastic neuroepithelial tumor of the right temporal lobe. The right temporal region dura was consistent with calcifying pseudotumor of the neuraxis. Family history included obesity, benign hypertension, cirrhosis of the liver, obesity, hyperlipidemia, cerebrovascular disease, and type II diabetes.

BRSTNOT05    PSPORT1    Library was constructed using RNA isolated from breast tissue removed from a 58-year-old Caucasian female during a unilateral extended simple mastectomy. Pathology for the associated tumor tissue indicated multicentric invasive grade 4 lobular carcinoma. Patient history included skin cancer, rheumatic heart disease, osteoarthritis, and tuberculosis. Family history included cerebrovascular and cardiovascular disease, breast and prostate cancer, and type I diabetes.

COLITUT02    pINCY    Library was constructed using RNA isolated from colon tumor tissue of the ileocecal valve removed from a 29-year-old female. Pathology indicated malignant lymphoma, small cell, non-cleaved (Burkitt's lymphoma, B-cell phenotype), forming a polypoid mass in the region of the ileocecal valve, associated with intussusception and obstruction clinically. The liver and multiple (3 of 12) ileocecal region lymph nodes were also involved by lymphoma.

CONFNOT03    pINCY    Library was constructed using RNA isolated from mesenteric fat tissue removed from a 71-year-old Caucasian male during a partial colectomy and permanent colostomy. Pathology indicated mesenteric fat tissue associated with diverticulosis and diverticulitis with abscess formation. Approximately 50 diverticula were noted, one of which was perforated and associated with abscess formation in adjacent mesenteric fat. The patient presented with atrial fibrillation. Patient history included viral hepatitis, a hemangioma, and diverticulitis of colon. Family history included extrinsic asthma, atherosclerotic coronary artery disease, and myocardial infarction.

ENDMUNE01    pINCY    This 5′ biased random primed library was constructed using RNA isolated from untreated umbilical artery endothelial cell tissue removed from a Caucasian male (Clonetics) newborn.

ESOGTUE01    pINCY    This 5′ biased random primed library was constructed using RNA isolated from esophageal tumor tissue removed from a 61-year-old Caucasian male during a partial esophagectomy, proximal gastrectomy, pyloromyotomy, and regional lymph node excision. Pathology indicated an invasive grade 3 adenocarcinoma in the esophagus, extending distally to involve the gastroesophageal junction. The tumor extended through the muscularis to involve periesophageal and perigastric soft tissues. One perigastric and two periesophageal lymph nodes were positive for tumor. There were multiple perigastric and periesophageal tumor implants. The patient presented with deficiency anemia and myelodysplasia. Patient history included hyperlipidemia, and tobacco and alcohol abuse in remission. Previous surgeries included adenotonsillectomy, rhinoplasty, vasectomy, and hemorrhoidectomy. A previous bone marrow aspiration found the marrow to be hypercellular for age and had a cellularity-to-fat ratio of 95:5. The marrow was focally densely fibrotic. Granulocytic precursors were slightly increased with normal maturation. The estimate of blast cells was greater than 5%. Megakaryocytes were increased and appeared atypical in clusters. Storage cells and granulomata were absent. Patient medications included Epoetin, Danocrine, Berocca Plus tablets, Selenium, vitamin B6 phosphate, vitamins E & C, and beta carotene. Family history included alcohol abuse, atherosclerotic coronary artery disease, type II diabetes, chronic liver disease, and primary cardiomyopathy in the father; and benign hypertension and cerebrovascular disease in the mother.

KIDEUNE02    pINCY    This 5′ biased random primed library was constructed using RNA isolated from an untreated transformed embryonal cell line (293-EBNA) derived from kidney epithelial tissue (Invitrogen). The cells were transformed with adenovirus 5 DNA.

KIDNNOT25    pINCY    Library was constructed using RNA isolated from kidney tissue removed from the left lower kidney pole of a 42-year-old Caucasian female during nephroureterectomy. Pathology indicated slight hydronephrosis and nephrolithiasis. Patient history included calculus of the kidney.

LUNGNOT03    PSPORT1    Library was constructed using RNA isolated from lung tissue of a 79-year-old Caucasian male. Pathology for the associated tumor tissue indicated grade 4 carcinoma. Patient history included a benign prostate neoplasm and atherosclerosis.

PLACFER06    pINCY    This random primed library was constructed using RNA isolated from placental tissue removed from a Caucasian fetus who died after 16 weeks' gestation from fetal demise and hydrocephalus. Patient history included umbilical cord wrapped around the head (3 times) and the shoulders (1 time). Serology was positive for anti-CMV. Family history included multiple pregnancies and live births, and an abortion.

PROSTUS23    pINCY    This subtracted prostate tumor library was constructed using 10 million clones from a pooled prostate tumor library that was subjected to 2 rounds of subtractive hybridization with 10 million clones from a pooled prostate tissue library. The starting library for subtraction was constructed by pooling equal numbers of clones from 4 prostate tumor libraries using mRNA isolated from prostate tumor removed from Caucasian males at ages 58 (A), 61 (B), 66 (C), and 68 (D) during prostatectomy with lymph node excision. Pathology indicated adenocarcinoma in all donors. History included elevated PSA, induration and tobacco abuse in donor A; elevated PSA, induration, prostate hyperplasia, renal failure, osteoarthritis, renal artery stenosis, benign HTN, thrombocytopenia, hyperlipidemia, tobacco/alcohol abuse and hepatitis C (carrier) in donor B; elevated PSA, induration, and tobacco abuse in donor C; and elevated PSA, induration, hypercholesterolemia, and kidney calculus in donor D. The hybridization probe for subtraction was constructed by pooling equal numbers of cDNA clones from 3 prostate tissue libraries derived from prostate tissue, prostate epithelial cells, and fibroblasts from prostate stroma from 3 different donors. Subtractive hybridization conditions were based on the methodologies of Swaroop et al., NAR 19 (1991): 1954 and Bonaldo, et al. Genome Research 6 (1996): 791.

SKIRNOR01    PCDNA2.1    This random primed library was constructed using RNA isolated from skin tissue removed from the breast of a 17-year-old Caucasian female during bilateral reduction mammoplasty. Patient history included breast hypertrophy. Family history included benign hypertension.

UTRSTME01    PCDNA2.1    This 5′ biased random primed library was constructed using RNA isolated from uterus tissue removed from a 49-year-old Caucasian female during vaginal hysterectomy and bilateral salpingo-oophorectomy. Pathology for the matched tumor tissue indicated multiple (6) intramural leiomyomata. The patient presented with excessive menstruation, deficiency anemia, and dysmenorrhea. Patient history included abdominal pregnancy, headache, and chronic obstructive asthma. Previous surgeries included hemorrhoidectomy, knee ligament repair, and intranasal lesion destruction. Patient medications included Azmacort, Proventil, Trazodone, Zostrix HP, iron, Premarin, and vitamin C. Family history included alcohol abuse, atherosclerotic coronary artery disease, upper lobe lung cancer, and carotid endarterectomy in the father; breast fibroadenosis in the sibling(s); and acute myocardial infarction, liver cancer, acute leukemia, and breast cancer (central) in the grandparent(s).

[0347] TABLE 7

| Program | Description | Reference | Parameter Threshold |
| ABI FACTURA | A program that removes vector sequences and masks ambiguous bases in nucleic acid sequences. | Applied Biosystems, Foster City, CA. | |
| ABI/PARACEL FDF | A Fast Data Finder useful in comparing and annotating amino acid or nucleic acid sequences. | Applied Biosystems, Foster City, CA; Paracel Inc., Pasadena, CA. | Mismatch < 50% |
| ABI AutoAssembler | A program that assembles nucleic acid sequences. | Applied Biosystems, Foster City, CA. | |
| BLAST | A Basic Local Alignment Search Tool useful in sequence similarity search for amino acid and nucleic acid sequences. BLAST includes five functions: blastp, blastn, blastx, tblastn, and tblastx. | Altschul, S. F. et al. (1990) J. Mol. Biol. 215: 403-410; Altschul, S. F. et al. (1997) Nucleic Acids Res. 25: 3389-3402. | ESTs: Probability value = 1.0E−8 or less. Full Length sequences: Probability value = 1.0E−10 or less. |
| FASTA | A Pearson and Lipman algorithm that searches for similarity between a query sequence and a group of sequences of the same type. FASTA comprises at least five functions: fasta, tfasta, fastx, tfastx, and ssearch. | Pearson, W. R. and D. J. Lipman (1988) Proc. Natl. Acad. Sci. USA 85: 2444-2448; Pearson, W. R. (1990) Methods Enzymol. 183: 63-98; and Smith, T. F. and M. S. Waterman (1981) Adv. Appl. Math. 2: 482-489. | ESTs: fasta E value = 1.06E−6. Assembled ESTs: fasta Identity = 95% or greater and Match length = 200 bases or greater; fastx E value = 1.0E−8 or less. Full Length sequences: fastx score = 100 or greater. |
| BLIMPS | A BLocks IMProved Searcher that matches a sequence against those in BLOCKS, PRINTS, DOMO, PRODOM, and PFAM databases to search for gene families, sequence homology, and structural fingerprint regions. | Henikoff, S. and J. G. Henikoff (1991) Nucleic Acids Res. 19: 6565-6572; Henikoff, J. G. and S. Henikoff (1996) Methods Enzymol. 266: 88-105; and Attwood, T. K. et al. (1997) J. Chem. Inf. Comput. Sci. 37: 417-424. | Probability value = 1.0E−3 or less |
| HMMER | An algorithm for searching a query sequence against hidden Markov model (HMM)-based databases of protein family consensus sequences, such as PFAM. | Krogh, A. et al. (1994) J. Mol. Biol. 235: 1501-1531; Sonnhammer, E. L. L. et al. (1988) Nucleic Acids Res. 26: 320-322; Durbin, R. et al. (1998) Our World View, in a Nutshell, Cambridge Univ. Press, pp. 1-350. | PFAM hits: Probability value = 1.0E−3 or less. Signal peptide hits: Score = 0 or greater. |
| ProfileScan | An algorithm that searches for structural and sequence motifs in protein sequences that match sequence patterns defined in Prosite. | Gribskov, M. et al. (1988) CABIOS 4: 61-66; Gribskov, M. et al. (1989) Methods Enzymol. 183: 146-159; Bairoch, A. et al. (1997) Nucleic Acids Res. 25: 217-221. | Normalized quality score ≧ GCG-specified “HIGH” value for that particular Prosite motif. Generally, score = 1.4-2.1. |
| Phred | A base-calling algorithm that examines automated sequencer traces with high sensitivity and probability. | Ewing, B. et al. (1998) Genome Res. 8: 175-185; Ewing, B. and P. Green (1998) Genome Res. 8: 186-194. | |
| Phrap | A Phils Revised Assembly Program including SWAT and CrossMatch, programs based on efficient implementation of the Smith-Waterman algorithm, useful in searching sequence homology and assembling DNA sequences. | Smith, T. F. and M. S. Waterman (1981) Adv. Appl. Math. 2: 482-489; Smith, T. F. and M. S. Waterman (1981) J. Mol. Biol. 147: 195-197; and Green, P., University of Washington, Seattle, WA. | Score = 120 or greater; Match length = 56 or greater |
| Consed | A graphical tool for viewing and editing Phrap assemblies. | Gordon, D. et al. (1998) Genome Res. 8: 195-202. | |
| SPScan | A weight matrix analysis program that scans protein sequences for the presence of secretory signal peptides. | Nielson, H. et al. (1997) Protein Engineering 10: 1-6; Claverie, J. M. and S. Audic (1997) CABIOS 12: 431-439. | Score = 3.5 or greater |
| TMAP | A program that uses weight matrices to delineate transmembrane segments on protein sequences and determine orientation. | Persson, B. and P. Argos (1994) J. Mol. Biol. 237: 182-192; Persson, B. and P. Argos (1996) Protein Sci. 5: 363-371. | |
| TMHMMER | A program that uses a hidden Markov model (HMM) to delineate transmembrane segments on protein sequences and determine orientation. | Sonnhammer, E. L. et al. (1998) Proc. Sixth Intl. Conf. on Intelligent Systems for Mol. Biol., Glasgow et al., eds., The Am. Assoc. for Artificial Intelligence Press, Menlo Park, CA, pp. 175-182. | |
| Motifs | A program that searches amino acid sequences for patterns that matched those defined in Prosite. | Bairoch, A. et al. (1997) Nucleic Acids Res. 25: 217-221; Wisconsin Package Program Manual, version 9, page M51-59, Genetics Computer Group, Madison, WI. | |
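
As an illustration only, the BLAST probability thresholds listed in Table 7 (1.0E−8 or less for ESTs, 1.0E−10 or less for full length sequences) might be applied to a set of search hits as in the following Python sketch. The `Hit` record, the `passes_blast_threshold` function, and the sample hits are hypothetical names and values assumed for this example; they are not part of BLAST or of any program listed in the table.

```python
from dataclasses import dataclass

# BLAST probability cutoffs as stated in Table 7.
BLAST_MAX_PROBABILITY = {
    "EST": 1.0e-8,
    "full_length": 1.0e-10,
}


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (not a real BLAST output format)."""
    query_id: str
    sequence_type: str  # "EST" or "full_length"
    probability: float


def passes_blast_threshold(hit: Hit) -> bool:
    """Return True if the hit's probability is at or below the Table 7 cutoff."""
    return hit.probability <= BLAST_MAX_PROBABILITY[hit.sequence_type]


# Made-up example hits, used only to exercise the filter.
hits = [
    Hit("est_a", "EST", 3.0e-9),
    Hit("est_b", "EST", 2.0e-7),
    Hit("fl_a", "full_length", 5.0e-12),
    Hit("fl_b", "full_length", 3.0e-9),
]

kept = [h.query_id for h in hits if passes_blast_threshold(h)]
print(kept)
```

Keeping the cutoffs in a single dictionary keyed by sequence type mirrors the way Table 7 states a separate threshold for ESTs and for full length sequences; thresholds for the other programs could be added in the same way.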

[0348]

1 34 1 333 PRT Homo sapiens misc_feature Incyte ID No 6930294CD1 1 MetAsn Pro Thr Leu Ile Leu Ala Ala Phe Cys Leu Gly Ile Ala 1 5 10 15 SerAla Thr Leu Thr Phe Asp His Ser Leu Glu Ala Gln Trp Thr 20 25 30 Lys TrpLys Ala Met His Asn Arg Leu Tyr Gly Met Asn Glu Glu 35 40 45 Gly Trp ArgArg Ala Val Trp Glu Lys Asn Met Lys Met Ile Glu 50 55 60 Leu His Asn GlnGlu Tyr Arg Glu Gly Lys His Ser Phe Thr Met 65 70 75 Ala Met Asn Ala PheGly Asp Met Thr Ser Glu Glu Phe Arg Gln 80 85 90 Val Met Asn Gly Phe GlnAsn Arg Lys Pro Arg Lys Gly Lys Val 95 100 105 Phe Gln Glu Pro Leu PheTyr Glu Ala Pro Arg Ser Val Asp Trp 110 115 120 Arg Glu Lys Gly Tyr ValThr Pro Val Lys Asn Gln Gly Gln Cys 125 130 135 Gly Ser Cys Trp Ala PheSer Ala Thr Gly Ala Leu Glu Gly Gln 140 145 150 Met Phe Arg Lys Thr GlyArg Leu Ile Ser Leu Ser Glu Gln Asn 155 160 165 Leu Val Asp Cys Ser GlyPro Gln Gly Asn Glu Gly Cys Asn Gly 170 175 180 Gly Leu Met Asp Tyr AlaPhe Gln Tyr Val Gln Asp Thr Gly Gly 185 190 195 Leu Asp Ser Glu Glu SerTyr Pro Tyr Glu Ala Thr Glu Glu Ser 200 205 210 Cys Arg Tyr Asn Pro LysTyr Ser Ala Ala Asn Asp Thr Gly Phe 215 220 225 Val Asp Ile Pro Ser GlnGlu Lys Asp Leu Ala Lys Ala Val Ala 230 235 240 Thr Val Gly Pro Ile SerVal Ala Ala Gly Ala Ser His Val Ser 245 250 255 Phe Gln Phe Tyr Lys LysGly Ile Tyr Phe Glu Pro Arg Cys Asp 260 265 270 Pro Glu Gly Leu Asp HisAla Met Leu Leu Val Gly Tyr Ser Tyr 275 280 285 Glu Gly Ala Asp Ser AspAsn Asn Lys Tyr Trp Leu Val Lys Asn 290 295 300 Arg Tyr Gly Lys Asn TrpGly Met Asp Gly Tyr Ile Lys Met Ala 305 310 315 Lys Asp Gln Arg Asn AsnCys Gly Ile Ala Thr Ala Ala Ser Tyr 320 325 330 Pro Thr Val 2 90 PRTHomo sapiens misc_feature Incyte ID No 7473018CD1 2 Met Ala Asp Gln LeuLeu Arg Lys Lys Arg Arg Ile Phe Ile His 1 5 10 15 Ser Val Gly Ala GlyThr Ile Asn Ala Leu Leu Asp Cys Leu Leu 20 25 30 Glu Asp Glu Val Ile SerGln Glu Asp Met Asn Lys Val Arg Asp 35 40 45 Glu Asn Asp Thr Val Met AspLys Ala Arg Val Leu Ile Asp Leu 50 55 60 Val Thr Gly Lys Gly Pro Lys SerCys Cys Lys 
Phe Ile Lys His 65 70 75 Leu Cys Glu Glu Asp Pro Gln Leu AlaSer Lys Met Gly Leu His 80 85 90 3 605 PRT Homo sapiens misc_featureIncyte ID No 7479221CD1 3 Met Ser Gln Leu Ser Ser Thr Leu Lys Arg TyrThr Glu Ser Ala 1 5 10 15 Arg Tyr Thr Asp Ala His Tyr Ala Lys Ser GlyTyr Gly Ala Tyr 20 25 30 Thr Pro Ser Ser Tyr Gly Ala Asn Leu Ala Ala SerLeu Leu Glu 35 40 45 Lys Glu Lys Leu Gly Phe Lys Pro Val Pro Thr Ser SerPhe Leu 50 55 60 Thr Arg Pro Arg Thr Tyr Gly Pro Ser Ser Leu Leu Asp TyrAsp 65 70 75 Arg Gly Arg Pro Leu Leu Arg Pro Asp Ile Thr Gly Gly Gly Lys80 85 90 Arg Ala Glu Ser Gln Thr Arg Gly Thr Glu Arg Pro Leu Gly Ser 95100 105 Gly Leu Ser Gly Gly Ser Gly Phe Pro Tyr Gly Val Thr Asn Asn 110115 120 Cys Leu Ser Tyr Leu Pro Ile Asn Ala Tyr Asp Gln Gly Val Thr 125130 135 Leu Thr Gln Lys Leu Asp Ser Gln Ser Asp Leu Ala Arg Asp Phe 140145 150 Ser Ser Leu Arg Thr Ser Asp Ser Tyr Arg Ile Asp Pro Arg Asn 155160 165 Leu Gly Arg Ser Pro Met Leu Ala Arg Thr Arg Lys Glu Leu Cys 170175 180 Thr Leu Gln Gly Leu Tyr Gln Thr Ala Ser Cys Pro Glu Tyr Leu 185190 195 Val Asp Tyr Leu Glu Asn Tyr Gly Arg Lys Gly Ser Ala Ser Gln 200205 210 Val Pro Ser Gln Ala Pro Pro Ser Arg Val Pro Glu Ile Ile Ser 215220 225 Pro Thr Tyr Arg Pro Ile Gly Arg Tyr Thr Leu Trp Glu Thr Gly 230235 240 Lys Gly Gln Ala Pro Gly Pro Ser Arg Ser Ser Ser Pro Gly Arg 245250 255 Asp Gly Met Asn Ser Lys Ser Ala Gln Gly Leu Ala Gly Leu Arg 260265 270 Asn Leu Gly Asn Thr Cys Phe Met Asn Ser Ile Leu Gln Cys Leu 275280 285 Ser Asn Thr Arg Glu Leu Arg Asp Tyr Cys Leu Gln Arg Leu Tyr 290295 300 Met Arg Asp Leu His His Gly Ser Asn Ala His Thr Ala Leu Val 305310 315 Glu Glu Phe Ala Lys Leu Ile Gln Thr Ile Trp Thr Ser Ser Pro 320325 330 Asn Asp Val Val Ser Pro Ser Glu Phe Lys Thr Gln Ile Gln Arg 335340 345 Tyr Ala Pro Arg Phe Val Gly Tyr Asn Gln Gln Asp Ala Gln Glu 350355 360 Phe Leu Arg Phe Leu Leu Asp Gly Leu His Asn Glu Val Asn Arg 365370 375 Val Thr Leu Arg Pro Lys Ser Asn Pro Glu Asn Leu Asp His Leu 380385 390 Pro Asp Asp Glu 
Lys Gly Arg Gln Met Trp Arg Lys Tyr Leu Glu 395400 405 Arg Glu Asp Ser Arg Ile Gly Asp Leu Phe Val Gly Gln Leu Lys 410415 420 Ser Ser Leu Thr Cys Thr Asp Cys Gly Tyr Cys Ser Thr Val Phe 425430 435 Asp Pro Phe Trp Asp Leu Ser Leu Pro Ile Ala Lys Arg Gly Tyr 440445 450 Pro Glu Val Thr Leu Met Asp Cys Met Arg Leu Phe Thr Lys Glu 455460 465 Asp Val Leu Asp Gly Asp Glu Lys Pro Thr Cys Cys Arg Cys Arg 470475 480 Gly Arg Lys Arg Cys Ile Lys Lys Phe Ser Ile Gln Arg Phe Pro 485490 495 Lys Ile Leu Val Leu His Leu Lys Arg Phe Ser Glu Ser Arg Ile 500505 510 Arg Thr Ser Lys Leu Thr Thr Phe Val Asn Phe Pro Leu Arg Asp 515520 525 Leu Asp Leu Arg Glu Phe Ala Ser Glu Asn Thr Asn His Ala Val 530535 540 Tyr Asn Leu Tyr Ala Val Ser Asn His Ser Gly Thr Thr Met Gly 545550 555 Gly His Tyr Thr Ala Tyr Cys Arg Ser Pro Gly Thr Gly Glu Trp 560565 570 His Thr Phe Asn Asp Ser Ser Val Thr Pro Met Ser Ser Ser Gln 575580 585 Val Arg Thr Ser Asp Ala Tyr Leu Leu Phe Tyr Glu Leu Ala Ser 590595 600 Pro Pro Ser Arg Met 605 4 743 PRT Homo sapiens misc_featureIncyte ID No 2923874CD1 4 Met Leu Ile Ser Gly Ile Leu Trp Thr Phe MetHis Gln Lys Pro 1 5 10 15 Thr Ala Ser His Tyr Leu Gln Val Lys Ser GlnAsp Gly Ile Leu 20 25 30 Ser Pro Gly Lys Gly Leu Glu Asp Thr Asp Val ValTyr Lys Ser 35 40 45 Glu Asn Gly His Val Ile Lys Leu Asn Ile Glu Thr AsnAla Thr 50 55 60 Thr Leu Leu Leu Glu Asn Thr Thr Phe Val Thr Phe Lys AlaSer 65 70 75 Arg His Ser Val Ser Pro Asp Leu Lys Tyr Val Leu Leu Ala Tyr80 85 90 Asp Val Lys Gln Ile Phe His Tyr Ser Tyr Thr Ala Ser Tyr Val 95100 105 Ile Tyr Asn Ile His Thr Arg Glu Val Trp Glu Leu Asn Pro Pro 110115 120 Glu Val Glu Asp Ser Val Leu Gln Tyr Ala Ala Trp Gly Val Gln 125130 135 Gly Gln Gln Leu Ile Tyr Ile Phe Glu Asn Asn Ile Tyr Tyr Gln 140145 150 Pro Asp Ile Lys Ser Ser Ser Leu Arg Leu Thr Ser Ser Gly Lys 155160 165 Glu Glu Ile Ile Phe Asn Gly Ile Ala Asp Trp Leu Tyr Glu Glu 170175 180 Glu Leu Leu His Ser His Ile Ala His Trp Trp Ser Pro Asp Gly 185190 195 Glu Arg Leu Ala Phe Leu Met Ile 
Asn Asp Ser Leu Val Pro Thr 200205 210 Met Val Ile Pro Arg Phe Thr Gly Ala Leu Tyr Pro Lys Gly Lys 215220 225 Gln Tyr Pro Tyr Pro Lys Ala Gly Gln Val Asn Pro Thr Ile Lys 230235 240 Leu Tyr Val Val Asn Leu Tyr Gly Pro Thr His Thr Leu Glu Leu 245250 255 Met Pro Pro Asp Ser Phe Lys Ser Arg Glu Tyr Tyr Ile Thr Met 260265 270 Val Lys Trp Val Ser Asn Thr Lys Thr Val Val Arg Trp Leu Asn 275280 285 Arg Pro Gln Asn Ile Ser Ile Leu Thr Val Cys Glu Thr Thr Thr 290295 300 Gly Ala Cys Ser Lys Lys Tyr Glu Met Thr Ser Asp Thr Trp Leu 305310 315 Ser Gln Gln Asn Glu Glu Pro Val Phe Ser Arg Asp Gly Ser Lys 320325 330 Phe Phe Met Thr Val Pro Val Lys Gln Gly Gly Arg Gly Glu Phe 335340 345 His His Ile Ala Met Phe Leu Ile Gln Ser Lys Ser Glu Gln Ile 350355 360 Thr Val Arg His Leu Thr Ser Gly Asn Trp Glu Val Ile Lys Ile 365370 375 Leu Ala Tyr Asp Glu Thr Thr Gln Lys Ile Tyr Phe Leu Ser Thr 380385 390 Glu Ser Ser Pro Arg Gly Arg Gln Leu Tyr Ser Ala Ser Thr Glu 395400 405 Gly Leu Leu Asn Arg Gln Cys Ile Ser Cys Asn Phe Met Lys Glu 410415 420 Gln Cys Thr Tyr Phe Asp Ala Ser Phe Ser Pro Met Asn Gln His 425430 435 Phe Leu Leu Phe Cys Glu Gly Pro Arg Val Pro Val Val Ser Leu 440445 450 His Ser Thr Asp Asn Pro Ala Lys Tyr Phe Ile Leu Glu Ser Asn 455460 465 Ser Met Leu Lys Glu Ala Ile Leu Lys Lys Lys Ile Gly Lys Pro 470475 480 Glu Ile Lys Ile Leu His Ile Asp Asp Tyr Glu Leu Pro Leu Gln 485490 495 Leu Ser Leu Pro Lys Asp Phe Met Asp Arg Asn Gln Tyr Ala Leu 500505 510 Leu Leu Ile Met Asp Glu Glu Pro Gly Gly Gln Leu Val Thr Asp 515520 525 Lys Phe His Ile Asp Trp Asp Ser Val Leu Ile Asp Met Asp Asn 530535 540 Val Ile Val Ala Arg Phe Asp Gly Arg Gly Ser Gly Phe Gln Gly 545550 555 Leu Lys Ile Leu Gln Glu Ile His Arg Arg Leu Gly Ser Val Glu 560565 570 Val Lys Asp Gln Ile Thr Ala Val Lys Phe Leu Leu Lys Leu Pro 575580 585 Tyr Ile Asp Ser Lys Arg Leu Ser Ile Phe Gly Lys Gly Tyr Gly 590595 600 Gly Tyr Ile Ala Ser Met Ile Leu Lys Ser Asp Glu Lys Leu Phe 605610 615 Lys Cys Gly Ser Val Val Ala Pro Ile Thr Asp 
Leu Lys Leu Tyr 620625 630 Ala Ser Ala Phe Ser Glu Arg Tyr Leu Gly Met Pro Ser Lys Glu 635640 645 Glu Ser Thr Tyr Gln Ala Ala Ser Val Leu His Asn Val His Gly 650655 660 Leu Lys Glu Glu Asn Ile Leu Ile Ile His Gly Thr Ala Asp Thr 665670 675 Lys Val His Phe Gln His Ser Ala Glu Leu Ile Lys His Leu Ile 680685 690 Lys Ala Gly Val Asn Tyr Thr Met Gln Val Tyr Pro Asp Glu Gly 695700 705 His Asn Val Ser Glu Lys Ser Lys Tyr His Leu Tyr Ser Thr Ile 710715 720 Leu Lys Phe Phe Ser Asp Cys Leu Lys Glu Glu Ile Ser Val Leu 725730 735 Pro Gln Glu Pro Glu Glu Asp Glu 740 5 650 PRT Homo sapiensmisc_feature Incyte ID No 55122335CD1 5 Met Ala Ser Gly Glu His Ser ProGly Ser Gly Ala Ala Arg Arg 1 5 10 15 Pro Leu His Ser Ala Gln Ala ValAsp Val Ala Ser Ala Ser Asn 20 25 30 Phe Arg Ala Phe Glu Leu Leu His LeuHis Leu Asp Leu Arg Ala 35 40 45 Glu Phe Gly Pro Pro Gly Pro Gly Ala GlySer Arg Gly Leu Ser 50 55 60 Gly Thr Ala Val Leu Asp Leu Arg Cys Leu GluPro Glu Gly Ala 65 70 75 Ala Glu Leu Arg Leu Asp Ser His Pro Cys Leu GluVal Thr Ala 80 85 90 Ala Ala Leu Arg Arg Glu Arg Pro Gly Ser Glu Glu ProPro Ala 95 100 105 Glu Pro Val Ser Phe Tyr Thr Gln Pro Phe Ser His TyrGly Gln 110 115 120 Ala Leu Cys Val Ser Phe Pro Gln Pro Cys Arg Ala AlaGlu Arg 125 130 135 Leu Gln Val Leu Leu Thr Tyr Arg Val Gly Glu Gly ProGly Val 140 145 150 Cys Trp Leu Ala Pro Glu Gln Thr Ala Gly Lys Lys LysPro Phe 155 160 165 Val Tyr Thr Gln Gly Gln Ala Val Leu Asn Arg Ala PhePhe Pro 170 175 180 Cys Phe Asp Thr Pro Ala Val Lys Tyr Lys Tyr Ser AlaLeu Ile 185 190 195 Glu Val Pro Asp Gly Phe Thr Ala Val Met Ser Ala SerThr Trp 200 205 210 Glu Lys Arg Gly Pro Asn Lys Phe Phe Phe Gln Met CysGln Pro 215 220 225 Ile Pro Ser Tyr Leu Ile Ala Leu Ala Ile Gly Asp LeuVal Ser 230 235 240 Ala Glu Val Gly Pro Arg Ser Arg Val Trp Ala Glu ProCys Leu 245 250 255 Ile Asp Ala Ala Lys Glu Glu Tyr Asn Gly Val Ile GluGlu Phe 260 265 270 Leu Ala Thr Gly Glu Lys Leu Phe Gly Pro Tyr Val TrpGly Arg 275 280 285 Tyr Asp Leu Leu Phe Met Pro Pro Ser Phe Pro 
Phe GlyGly Met 290 295 300 Glu Asn Pro Cys Leu Thr Phe Val Thr Pro Cys Leu LeuAla Gly 305 310 315 Asp Arg Ser Leu Ala Asp Val Ile Ile His Glu Ile SerHis Ser 320 325 330 Trp Phe Gly Asn Leu Val Thr Asn Ala Asn Trp Gly GluPhe Trp 335 340 345 Leu Asn Glu Gly Phe Thr Met Tyr Ala Gln Arg Arg IleSer Thr 350 355 360 Ile Leu Phe Gly Ala Ala Tyr Thr Cys Leu Glu Ala AlaThr Gly 365 370 375 Arg Ala Leu Leu Arg Gln His Met Asp Ile Thr Gly GluGlu Asn 380 385 390 Pro Leu Asn Lys Leu Arg Val Lys Ile Glu Pro Gly ValAsp Pro 395 400 405 Asp Asp Thr Tyr Asn Glu Thr Pro Tyr Glu Lys Gly PheCys Phe 410 415 420 Val Ser Tyr Leu Ala His Leu Val Gly Asp Gln Asp GlnPhe Asp 425 430 435 Ser Phe Leu Lys Ala Tyr Val His Glu Phe Lys Phe ArgSer Ile 440 445 450 Leu Ala Asp Asp Phe Leu Asp Phe Tyr Leu Glu Tyr PhePro Glu 455 460 465 Leu Lys Lys Lys Arg Val Asp Ile Ile Pro Gly Phe GluPhe Asp 470 475 480 Arg Trp Leu Asn Thr Pro Gly Trp Pro Pro Tyr Leu ProAsp Leu 485 490 495 Ser Pro Gly Asp Ser Leu Met Lys Pro Ala Glu Glu LeuAla Gln 500 505 510 Leu Trp Ala Ala Glu Glu Leu Asp Met Lys Ala Ile GluAla Val 515 520 525 Ala Ile Ser Pro Trp Lys Thr Tyr Gln Leu Val Tyr PheLeu Asp 530 535 540 Lys Ile Leu Gln Lys Ser Pro Leu Pro Pro Gly Asn ValLys Lys 545 550 555 Leu Gly Asp Thr Tyr Pro Ser Ile Ser Asn Ala Arg AsnAla Glu 560 565 570 Leu Arg Leu Arg Trp Gly Gln Ile Ile Leu Lys Asn AspHis Gln 575 580 585 Glu Asp Phe Trp Lys Val Lys Glu Phe Leu His Asn GlnGly Lys 590 595 600 Gln Lys Tyr Thr Leu Pro Leu Tyr His Ala Met Met GlyGly Ser 605 610 615 Glu Val Ala Gln Thr Leu Ala Lys Glu Thr Phe Ala SerThr Ala 620 625 630 Ser Gln Leu His Ser Asn Val Val Asn Tyr Val Gln GlnIle Val 635 640 645 Ala Pro Lys Gly Ser 650 6 932 PRT Homo sapiensmisc_feature Incyte ID No 7473550CD1 6 Met Gly Leu Leu Ala Ser Ala GlyLeu Leu Leu Leu Leu Val Ile 1 5 10 15 Gly His Pro Arg Ser Leu Gly LeuLys Cys Gly Ile Arg Met Val 20 25 30 Asn Met Lys Ser Lys Glu Pro Ala ValGly Ser Arg Phe Phe Ser 35 40 45 Arg Ile Ser Ser Trp Arg Asn Ser Thr ValThr Gly 
His Pro Trp 50 55 60 Gln Val Tyr Leu Lys Ser Asp Glu His His PheCys Gly Gly Ser 65 70 75 Leu Ile Gln Glu Asp Arg Val Val Thr Ala Ala HisCys Leu His 80 85 90 Ser Leu Ser Glu Lys Gln Leu Lys Asn Ile Thr Val ThrSer Gly 95 100 105 Glu Tyr Ser Leu Phe Gln Lys Asp Lys Gln Glu Gln AsnIle Pro 110 115 120 Val Ser Lys Ile Ile Thr His Pro Glu Tyr Asn Ser ArgGlu Tyr 125 130 135 Met Ser Pro Asp Ile Ala Leu Leu Tyr Leu Lys His LysVal Lys 140 145 150 Phe Gly Asn Ala Val Gln Pro Ile Cys Leu Pro Asp SerAsp Asp 155 160 165 Lys Val Glu Pro Gly Ile Leu Cys Leu Ser Ser Gly TrpGly Lys 170 175 180 Ile Ser Lys Thr Ser Glu Tyr Ser Asn Val Leu Gln GluMet Glu 185 190 195 Leu Pro Ile Met Asp Asp Arg Ala Cys Asn Thr Val LeuLys Ser 200 205 210 Met Asn Leu Pro Pro Leu Gly Arg Thr Met Leu Cys AlaGly Phe 215 220 225 Pro Asp Trp Gly Met Asp Ala Cys Gln Gly Asp Ser GlyGly Pro 230 235 240 Leu Val Cys Arg Arg Gly Gly Gly Ile Trp Ile Leu AlaGly Ile 245 250 255 Thr Ser Trp Val Ala Gly Cys Ala Gly Gly Ser Val ProVal Arg 260 265 270 Asn Asn His Val Lys Ala Ser Leu Gly Ile Phe Ser LysVal Ser 275 280 285 Glu Leu Met Asp Phe Ile Thr Gln Asn Leu Phe Thr GlyLeu Asp 290 295 300 Arg Gly Gln Pro Leu Ser Lys Val Gly Ser Arg Tyr IleThr Lys 305 310 315 Ala Leu Ser Ser Val Gln Glu Val Asn Gly Ser Gln ArgAsp Lys 320 325 330 Ile Ile Leu Ile Lys Phe Thr Ser Leu Asp Met Glu LysGln Val 335 340 345 Gly Cys Asp His Asp Tyr Val Ser Leu Arg Ser Ser SerGly Val 350 355 360 Leu Phe Ser Lys Val Cys Gly Lys Ile Leu Pro Ser ProLeu Leu 365 370 375 Ala Glu Thr Ser Glu Ala Met Val Pro Phe Val Ser AspThr Glu 380 385 390 Asp Ser Gly Ser Gly Phe Glu Leu Thr Val Thr Ala ValGln Lys 395 400 405 Ser Glu Ala Gly Ser Gly Cys Gly Ser Leu Ala Ile LeuVal Glu 410 415 420 Glu Gly Thr Asn His Ser Ala Lys Tyr Pro Asp Leu TyrPro Ser 425 430 435 Asn Thr Arg Cys His Trp Phe Ile Cys Ala Pro Glu LysHis Ile 440 445 450 Ile Lys Leu Thr Phe Glu Asp Phe Ala Val Lys Phe SerPro Asn 455 460 465 Cys Ile Tyr Asp Ala Val Val Ile Tyr Gly Asp Ser GluGlu Lys 470 475 
480 His Lys Leu Ala Lys Leu Cys Gly Met Leu Thr Ile ThrSer Ile 485 490 495 Phe Ser Ser Ser Asn Met Thr Val Ile Tyr Phe Lys SerAsp Gly 500 505 510 Lys Asn Arg Leu Gln Gly Phe Lys Ala Arg Phe Thr IleLeu Pro 515 520 525 Ser Glu Ser Leu Asn Lys Phe Glu Pro Lys Leu Pro ProGln Asn 530 535 540 Asn Pro Val Ser Thr Val Lys Ala Ile Leu His Asp ValCys Gly 545 550 555 Ile Pro Pro Phe Ser Pro Gln Trp Leu Ser Arg Arg IleAla Gly 560 565 570 Gly Glu Glu Ala Cys Pro His Cys Trp Pro Trp Gln ValGly Leu 575 580 585 Arg Phe Leu Gly Asp Tyr Gln Cys Gly Gly Ala Ile IleAsn Pro 590 595 600 Val Trp Ile Leu Thr Ala Ala His Cys Val Gln Leu LysAsn Asn 605 610 615 Pro Leu Ser Trp Thr Ile Ile Ala Gly Asp His Asp ArgAsn Leu 620 625 630 Lys Glu Ser Thr Glu Gln Val Arg Arg Ala Lys His IleIle Val 635 640 645 His Glu Asp Phe Asn Thr Leu Ser Tyr Asp Ser Asp IleAla Leu 650 655 660 Ile Gln Leu Ser Ser Pro Leu Glu Tyr Asn Ser Val ValArg Pro 665 670 675 Val Cys Leu Pro His Ser Ala Glu Pro Leu Phe Ser SerGlu Ile 680 685 690 Cys Ala Val Thr Gly Trp Gly Ser Ile Ser Ala Glu LeuSer Leu 695 700 705 Asn Val Ser Ser Leu Asp Gly Gly Leu Ala Ser Arg LeuGln Gln 710 715 720 Ile Gln Val His Val Leu Glu Arg Glu Val Cys Glu HisThr Tyr 725 730 735 Tyr Ser Ala His Pro Gly Gly Ile Thr Glu Lys Met IleCys Ala 740 745 750 Gly Phe Ala Ala Ser Gly Glu Lys Asp Phe Cys Gln GlyAsp Ser 755 760 765 Gly Gly Pro Leu Val Cys Arg His Glu Asn Gly Pro PheVal Leu 770 775 780 Tyr Gly Ile Val Ser Trp Gly Ala Gly Cys Val Gln ProTrp Lys 785 790 795 Pro Gly Val Phe Ala Arg Val Met Ile Phe Leu Asp TrpIle Gln 800 805 810 Ser Lys Ile Asn Gly Lys Leu Phe Ser Asn Val Ile LysThr Ile 815 820 825 Thr Ser Phe Phe Arg Val Gly Leu Gly Thr Val Ser CysCys Ser 830 835 840 Glu Ala Glu Leu Glu Lys Pro Arg Gly Phe Phe Pro ThrPro Arg 845 850 855 Tyr Leu Leu Asp Tyr Arg Gly Arg Leu Glu Cys Ser TrpVal Leu 860 865 870 Arg Val Ser Ala Ser Ser Met Ala Lys Phe Thr Ile GluTyr Leu 875 880 885 Ser Leu Leu Gly Ser Pro Val Cys Gln Asp Ser Val LeuIle Ile 890 895 900 Tyr Glu 
Glu Arg His Ser Lys Arg Lys Thr Ala Gly AsnPro Ser 905 910 915 Trp His Leu Pro Met Glu Ile Ser Ser Pro Phe Lys SerHis His 920 925 930 Ser Ala 7 990 PRT Homo sapiens misc_feature IncyteID No 7478108CD1 7 Met Gly Pro Pro Ser Ser Ser Gly Phe Tyr Val Ser ArgAla Val 1 5 10 15 Ala Leu Leu Leu Ala Gly Leu Val Ala Ala Leu Leu LeuAla Leu 20 25 30 Ala Val Leu Ala Ala Leu Tyr Gly His Cys Glu Arg Val ProPro 35 40 45 Ser Glu Leu Pro Gly Leu Arg Asp Ser Glu Ala Glu Ser Ser Pro50 55 60 Pro Leu Arg Gln Lys Pro Thr Pro Thr Pro Lys Pro Ser Ser Ala 6570 75 Arg Glu Leu Ala Val Thr Thr Thr Pro Ser Asn Trp Arg Pro Pro 80 8590 Gly Pro Trp Asp Gln Leu Arg Leu Pro Pro Trp Leu Val Pro Leu 95 100105 His Tyr Asp Leu Glu Leu Trp Pro Gln Leu Arg Pro Asp Glu Leu 110 115120 Pro Ala Gly Ser Leu Pro Phe Thr Gly Arg Val Asn Ile Thr Val 125 130135 Arg Cys Thr Val Ala Thr Ser Arg Leu Leu Leu His Ser Leu Phe 140 145150 Gln Asp Cys Glu Arg Ala Glu Val Arg Gly Pro Leu Ser Pro Gly 155 160165 Thr Gly Asn Ala Thr Val Gly Arg Val Pro Val Asp Asp Val Trp 170 175180 Phe Ala Leu Asp Thr Glu Tyr Met Val Leu Glu Leu Ser Glu Pro 185 190195 Leu Lys Pro Gly Ser Ser Tyr Glu Leu Gln Leu Ser Phe Ser Gly 200 205210 Leu Val Lys Glu Asp Leu Arg Glu Gly Leu Phe Leu Asn Val Tyr 215 220225 Thr Asp Gln Gly Glu Arg Arg Ala Leu Leu Ala Ser Gln Leu Glu 230 235240 Pro Thr Phe Ala Arg Tyr Val Phe Pro Cys Phe Asp Glu Pro Ala 245 250255 Leu Lys Ala Thr Phe Asn Ile Thr Met Ile His His Pro Ser Tyr 260 265270 Val Ala Leu Ser Asn Met Pro Lys Leu Gly Gln Ser Glu Lys Glu 275 280285 Asp Val Asn Gly Ser Lys Trp Thr Val Thr Thr Phe Ser Thr Thr 290 295300 Pro His Met Pro Thr Tyr Leu Val Ala Phe Val Ile Cys Asp Tyr 305 310315 Asp His Val Asn Arg Thr Glu Arg Gly Lys Glu Ile Arg Ile Trp 320 325330 Ala Arg Lys Asp Ala Ile Ala Asn Gly Ser Ala Asp Phe Ala Leu 335 340345 Asn Ile Thr Gly Pro Ile Phe Ser Phe Leu Glu Asp Leu Phe Asn 350 355360 Ile Ser Tyr Ser Leu Pro Lys Thr Asp Ile Ile Ala Leu Pro Ser 365 370375 Phe Asp Asn His Ala Met Glu Asn Trp Gly 
Leu Met Ile Phe Asp 380 385390 Glu Ser Gly Leu Leu Leu Glu Pro Lys Asp Gln Leu Thr Glu Lys 395 400405 Lys Thr Leu Ile Ser Tyr Val Val Ser His Glu Ile Gly His Gln 410 415420 Trp Phe Gly Asn Leu Val Thr Met Asn Trp Trp Asn Asn Ile Trp 425 430435 Leu Asn Glu Gly Phe Ala Ser Tyr Phe Glu Phe Glu Val Ile Asn 440 445450 Tyr Phe Asn Pro Lys Leu Pro Arg Asn Glu Ile Phe Phe Ser Asn 455 460465 Ile Leu His Asn Ile Leu Arg Glu Asp His Ala Leu Val Thr Arg 470 475480 Ala Val Ala Met Lys Val Glu Asn Phe Lys Thr Ser Glu Ile Gln 485 490495 Glu Leu Phe Asp Ile Phe Thr Tyr Ser Lys Gly Ala Ser Met Ala 500 505510 Arg Met Leu Ser Cys Phe Leu Asn Glu His Leu Phe Val Ser Ala 515 520525 Leu Lys Ser Tyr Leu Lys Thr Phe Ser Tyr Ser Asn Ala Glu Gln 530 535540 Asp Asp Leu Trp Arg His Phe Gln Met Ala Ile Asp Asp Gln Ser 545 550555 Thr Val Ile Leu Pro Ala Thr Ile Lys Asn Ile Met Asp Ser Trp 560 565570 Thr His Gln Ser Gly Phe Pro Val Ile Thr Leu Asn Val Ser Thr 575 580585 Gly Val Met Lys Gln Glu Pro Phe Tyr Leu Glu Asn Ile Lys Asn 590 595600 Arg Thr Leu Leu Thr Ser Asn Asp Thr Trp Ile Val Pro Ile Leu 605 610615 Trp Ile Lys Asn Gly Thr Thr Gln Pro Leu Val Trp Leu Asp Gln 620 625630 Ser Ser Lys Val Phe Pro Glu Met Gln Val Ser Asp Ser Asp His 635 640645 Asp Trp Val Ile Leu Asn Leu Asn Met Thr Gly Tyr Tyr Arg Val 650 655660 Asn Tyr Asp Lys Leu Gly Trp Lys Lys Leu Asn Gln Gln Leu Glu 665 670675 Lys Asp Pro Lys Ala Ile Pro Val Ile His Arg Leu Gln Phe Ile 680 685690 Asp Asp Ala Phe Ser Leu Ser Lys Asn Asn Tyr Ile Glu Ile Glu 695 700705 Thr Ala Leu Glu Leu Thr Lys Tyr Leu Ala Glu Glu Asp Glu Ile 710 715720 Ile Val Trp His Thr Val Leu Val Asn Leu Val Thr Arg Asp Leu 725 730735 Val Ser Glu Val Asn Ile Tyr Asp Ile Tyr Ser Leu Leu Lys Arg 740 745750 Tyr Leu Leu Lys Arg Leu Asn Leu Ile Trp Asn Ile Tyr Ser Thr 755 760765 Ile Ile Arg Glu Asn Val Leu Ala Leu Gln Asp Asp Tyr Leu Ala 770 775780 Leu Ile Ser Leu Glu Lys Leu Phe Val Thr Ala Cys Trp Leu Gly 785 790795 Leu Glu Asp Cys Leu Gln Leu Ser Lys Glu Leu Phe Ala 
Lys Trp 800 805810 Val Asp His Pro Glu Asn Glu Ile Pro Tyr Pro Ile Lys Asp Val 815 820825 Val Leu Cys Tyr Gly Ile Ala Leu Gly Ser Asp Lys Glu Trp Asp 830 835840 Ile Leu Leu Asn Thr Tyr Thr Asn Thr Thr Asn Lys Glu Glu Lys 845 850855 Ile Gln Leu Ala Tyr Ala Met Ser Cys Ser Lys Asp Pro Trp Ile 860 865870 Leu Asn Arg Tyr Met Glu Tyr Ala Ile Ser Thr Ser Pro Phe Thr 875 880885 Ser Asn Glu Thr Asn Ile Ile Glu Val Val Ala Ser Ser Glu Val 890 895900 Gly Arg Tyr Val Ala Lys Asp Phe Leu Val Asn Asn Trp Gln Ala 905 910915 Val Ser Lys Arg Tyr Gly Thr Gln Ser Leu Ile Asn Leu Ile Tyr 920 925930 Thr Ile Gly Arg Thr Val Thr Thr Asp Leu Gln Ile Val Glu Leu 935 940945 Gln Gln Phe Phe Ser Asn Met Leu Glu Glu His Gln Arg Ile Arg 950 955960 Val His Ala Asn Leu Gln Thr Ile Lys Asn Glu Asn Leu Lys Asn 965 970975 Lys Lys Leu Ser Ala Arg Ile Ala Ala Trp Leu Arg Arg Asn Thr 980 985990 8 396 PRT Homo sapiens misc_feature Incyte ID No 7482021CD1 8 MetArg Thr Ser Tyr Thr Val Thr Leu Pro Glu Asp Pro Pro Ala 1 5 10 15 AlaPro Phe Pro Ala Leu Ala Lys Glu Leu Arg Pro Arg Ser Pro 20 25 30 Leu SerPro Ser Leu Leu Leu Ser Thr Phe Val Gly Leu Leu Leu 35 40 45 Asn Lys AlaLys Asn Ser Lys Ser Ala Gln Gly Leu Ala Gly Leu 50 55 60 Arg Asn Leu GlyAsn Thr Cys Phe Met Asn Ser Ile Leu Gln Cys 65 70 75 Leu Ser Asn Thr ArgGlu Leu Arg Asp Tyr Cys Leu Gln Arg Leu 80 85 90 Tyr Met Arg Asp Leu HisHis Gly Ser Asn Ala His Thr Ala Leu 95 100 105 Val Glu Glu Phe Ala LysLeu Ile Gln Thr Ile Trp Thr Ser Ser 110 115 120 Pro Asn Asp Val Val SerPro Ser Glu Phe Lys Thr Gln Ile Gln 125 130 135 Arg Tyr Ala Pro Arg PheVal Gly Tyr Asn Gln Gln Asp Ala Gln 140 145 150 Glu Phe Leu Arg Phe LeuLeu Asp Gly Leu His Asn Glu Val Asn 155 160 165 Arg Val Thr Leu Arg ProLys Ser Asn Pro Glu Asn Leu Asp His 170 175 180 Leu Pro Asp Asp Glu LysGly Arg Gln Met Trp Arg Lys Tyr Leu 185 190 195 Glu Arg Glu Asp Ser ArgIle Gly Asp Leu Phe Val Gly Gln Leu 200 205 210 Lys Ser Ser Leu Thr CysThr Asp Cys Gly Tyr Cys Ser Thr Val 215 220 225 Phe Asp Pro Phe Trp 
AspLeu Ser Leu Pro Ile Ala Lys Arg Gly 230 235 240 Tyr Pro Glu Val Thr LeuMet Asp Cys Met Arg Leu Phe Thr Lys 245 250 255 Glu Asp Val Leu Asp GlyAsp Glu Lys Pro Thr Cys Cys Arg Cys 260 265 270 Arg Gly Arg Lys Arg CysIle Lys Lys Phe Ser Ile Gln Arg Phe 275 280 285 Pro Lys Ile Leu Val LeuHis Leu Lys Arg Phe Ser Glu Ser Arg 290 295 300 Ile Arg Thr Ser Lys LeuThr Thr Phe Val Asn Phe Pro Leu Arg 305 310 315 Asp Leu Asp Leu Arg GluPhe Ala Ser Glu Asn Thr Asn His Ala 320 325 330 Val Tyr Asn Leu Tyr AlaVal Ser Asn His Ser Gly Thr Thr Met 335 340 345 Gly Gly His Tyr Thr AlaTyr Cys Arg Ser Pro Gly Thr Gly Glu 350 355 360 Trp His Thr Phe Asn AspSer Ser Val Thr Pro Met Ser Ser Ser 365 370 375 Gln Val Arg Thr Ser AspAla Tyr Leu Leu Phe Tyr Glu Leu Ala 380 385 390 Ser Pro Pro Ser Arg Met395 9 250 PRT Homo sapiens misc_feature Incyte ID No 7482145CD1 9 MetAla Ser Arg Tyr Asp Arg Ala Ile Thr Val Phe Ser Pro Asp 1 5 10 15 GlyHis Leu Phe Gln Val Glu Tyr Ala Gln Glu Ala Val Lys Lys 20 25 30 Gly SerThr Ala Val Gly Ile Arg Gly Thr Asn Ile Val Val Leu 35 40 45 Gly Val GluLys Lys Ser Val Ala Lys Leu Gln Asp Glu Arg Thr 50 55 60 Val Arg Lys IleCys Ala Leu Asp Asp His Val Cys Met Ala Phe 65 70 75 Ala Gly Leu Thr AlaAsp Ala Arg Val Val Ile Asn Arg Ala Arg 80 85 90 Val Glu Cys Gln Ser HisLys Leu Thr Val Glu Asp Pro Val Thr 95 100 105 Val Glu Tyr Ile Thr ArgPhe Ile Ala Thr Leu Lys Gln Lys Tyr 110 115 120 Thr Gln Ser Asn Gly ArgArg Pro Phe Gly Ile Ser Ala Leu Ile 125 130 135 Val Gly Phe Asp Asp AspGly Ile Ser Arg Leu Tyr Gln Thr Asp 140 145 150 Pro Ser Gly Thr Tyr HisAla Trp Lys Ala Asn Ala Ile Gly Arg 155 160 165 Ser Ala Lys Thr Val ArgGlu Phe Leu Glu Lys Asn Tyr Thr Glu 170 175 180 Asp Ala Ile Ala Ser AspSer Glu Ala Ile Lys Leu Ala Ile Lys 185 190 195 Ala Leu Leu Glu Val ValGln Ser Gly Gly Lys Asn Ile Glu Leu 200 205 210 Ala Ile Ile Arg Arg AsnGln Pro Leu Lys Met Phe Ser Ala Lys 215 220 225 Glu Val Glu Leu Tyr ValThr Glu Ile Glu Lys Glu Lys Glu Glu 230 235 240 Ala Glu Lys Lys Lys SerLys Lys 
Ser Val 245 250 10 1045 PRT Homo sapiens misc_feature Incyte IDNo 55022586CD1 10 Met Thr Ala Glu Leu Gln Gln Asp Asp Ala Ala Gly AlaAla Asp 1 5 10 15 Gly His Gly Ser Ser Cys Gln Met Leu Leu Asn Gln LeuArg Glu 20 25 30 Ile Thr Gly Ile Gln Asp Pro Ser Phe Leu His Glu Ala LeuLys 35 40 45 Ala Ser Asn Gly Asp Ile Thr Gln Ala Val Ser Leu Leu Thr Asp50 55 60 Glu Arg Val Lys Glu Pro Ser Gln Asp Thr Val Ala Thr Glu Pro 6570 75 Ser Glu Val Glu Gly Ser Ala Ala Asn Lys Glu Val Leu Ala Lys 80 8590 Val Ile Asp Leu Thr His Asp Asn Lys Asp Asp Leu Gln Ala Ala 95 100105 Ile Ala Leu Ser Leu Leu Glu Ser Pro Lys Ile Gln Ala Asp Gly 110 115120 Arg Asp Leu Asn Arg Met His Glu Ala Thr Ser Ala Glu Thr Lys 125 130135 Arg Ser Lys Arg Lys Arg Cys Glu Val Trp Gly Glu Asn Pro Asn 140 145150 Pro Asn Asp Trp Arg Arg Val Asp Gly Trp Pro Val Gly Leu Lys 155 160165 Asn Val Gly Asn Thr Cys Trp Phe Ser Ala Val Ile Gln Ser Leu 170 175180 Phe Gln Leu Pro Glu Phe Arg Arg Leu Val Leu Ser Tyr Ser Leu 185 190195 Pro Gln Asn Val Leu Glu Asn Cys Arg Ser His Thr Glu Lys Arg 200 205210 Asn Ile Met Phe Met Gln Glu Leu Gln Tyr Leu Phe Ala Leu Met 215 220225 Met Gly Ser Asn Arg Lys Phe Val Asp Pro Ser Ala Ala Leu Asp 230 235240 Leu Leu Lys Gly Ala Phe Arg Ser Ser Glu Glu Gln Gln Gln Asp 245 250255 Val Ser Glu Phe Thr His Lys Leu Leu Asp Trp Leu Glu Asp Ala 260 265270 Phe Gln Leu Ala Val Asn Val Asn Ser Pro Arg Asn Lys Ser Glu 275 280285 Asn Pro Met Val Gln Leu Phe Tyr Gly Thr Phe Leu Thr Glu Gly 290 295300 Val Arg Glu Gly Lys Pro Phe Cys Asn Asn Glu Thr Phe Gly Gln 305 310315 Tyr Pro Leu Gln Val Asn Gly Tyr Arg Asn Leu Asp Glu Cys Leu 320 325330 Glu Gly Ala Met Val Glu Gly Asp Val Glu Leu Leu Pro Ser Asp 335 340345 His Ser Val Lys Tyr Gly Gln Glu Arg Trp Phe Thr Lys Leu Pro 350 355360 Pro Val Leu Thr Phe Glu Leu Ser Arg Phe Glu Phe Asn Gln Ser 365 370375 Leu Gly Gln Pro Glu Lys Ile His Asn Lys Leu Glu Phe Pro Gln 380 385390 Ile Ile Tyr Met Asp Arg Tyr Met Tyr Arg Ser Lys Glu Leu Ile 395 400405 Arg Asn Lys Arg Glu 
Cys Ile Arg Lys Leu Lys Glu Glu Ile Lys 410 415420 Ile Leu Gln Gln Lys Leu Glu Arg Tyr Val Lys Tyr Gly Ser Gly 425 430435 Pro Ala Arg Phe Pro Leu Pro Asp Met Leu Lys Tyr Val Ile Glu 440 445450 Phe Ala Ser Thr Lys Pro Ala Ser Glu Ser Cys Pro Pro Glu Ser 455 460465 Asp Thr His Met Thr Leu Pro Leu Ser Ser Val His Cys Ser Val 470 475480 Ser Asp Gln Thr Ser Lys Glu Ser Thr Ser Thr Glu Ser Ser Ser 485 490495 Gln Asp Val Glu Ser Thr Phe Ser Ser Pro Glu Asp Ser Leu Pro 500 505510 Lys Ser Lys Pro Leu Thr Ser Ser Arg Ser Ser Met Glu Met Pro 515 520525 Ser Gln Pro Ala Pro Arg Thr Val Thr Asp Glu Glu Ile Asn Phe 530 535540 Val Lys Thr Cys Leu Gln Arg Trp Arg Ser Glu Ile Glu Gln Asp 545 550555 Ile Gln Asp Leu Lys Thr Cys Ile Ala Ser Thr Thr Gln Thr Ile 560 565570 Glu Gln Met Tyr Cys Asp Pro Leu Leu Arg Gln Val Pro Tyr Arg 575 580585 Leu His Ala Val Leu Val His Glu Gly Gln Ala Asn Ala Gly His 590 595600 Tyr Trp Ala Tyr Ile Tyr Asn Gln Pro Arg Gln Ser Trp Leu Lys 605 610615 Tyr Asn Asp Ile Ser Val Thr Glu Ser Ser Trp Glu Glu Val Glu 620 625630 Arg Asp Ser Tyr Gly Gly Leu Arg Asn Val Ser Ala Tyr Cys Leu 635 640645 Met Tyr Ile Asn Asp Lys Leu Pro Tyr Phe Asn Ala Glu Ala Ala 650 655660 Pro Thr Glu Ser Asp Gln Met Ser Glu Val Glu Ala Leu Ser Val 665 670675 Glu Leu Lys His Tyr Ile Gln Glu Asp Asn Trp Arg Phe Glu Gln 680 685690 Glu Val Glu Glu Trp Glu Glu Glu Gln Ser Cys Lys Ile Pro Gln 695 700705 Met Glu Ser Ser Thr Asn Ser Ser Ser Gln Asp Tyr Ser Thr Ser 710 715720 Gln Glu Pro Ser Val Ala Ser Ser His Gly Val Arg Cys Leu Ser 725 730735 Ser Glu His Ala Val Ile Val Lys Glu Gln Thr Ala Gln Ala Ile 740 745750 Ala Asn Thr Ala Arg Ala Tyr Glu Lys Ser Gly Val Glu Ala Ala 755 760765 Leu Ser Glu Ala Phe His Glu Glu Tyr Ser Arg Leu Tyr Gln Leu 770 775780 Ala Lys Glu Thr Pro Thr Ser His Ser Asp Pro Arg Leu Gln His 785 790795 Val Leu Val Tyr Phe Phe Gln Asn Glu Ala Pro Lys Arg Val Val 800 805810 Glu Arg Thr Leu Leu Glu Gln Phe Ala Asp Lys Asn Leu Ser Tyr 815 820825 Asp Glu Arg Ser Ile Ser Ile Met 
Lys Val Ala Gln Ala Lys Leu 830 835840 Lys Glu Ile Gly Pro Asp Asp Met Asn Met Glu Glu Tyr Lys Lys 845 850855 Trp His Glu Asp Tyr Ser Leu Phe Arg Lys Val Ser Val Tyr Leu 860 865870 Leu Thr Gly Leu Glu Leu Tyr Gln Lys Gly Lys Tyr Gln Glu Ala 875 880885 Leu Ser Tyr Leu Val Tyr Ala Tyr Gln Ser Asn Ala Ala Leu Leu 890 895900 Met Lys Gly Pro Arg Arg Gly Val Lys Glu Ser Val Ile Ala Leu 905 910915 Tyr Arg Arg Lys Cys Leu Leu Glu Leu Asn Ala Lys Ala Ala Ser 920 925930 Leu Phe Glu Thr Asn Asp Asp His Ser Val Thr Glu Gly Ile Asn 935 940945 Val Met Asn Glu Leu Ile Ile Pro Cys Ile His Leu Ile Ile Asn 950 955960 Asn Asp Ile Ser Lys Asp Asp Leu Asp Ala Ile Glu Val Met Arg 965 970975 Asn His Trp Cys Ser Tyr Leu Gly Gln Asp Ile Ala Glu Asn Leu 980 985990 Gln Leu Cys Leu Gly Glu Phe Leu Pro Arg Leu Leu Asp Pro Ser 995 10001005 Ala Glu Ile Ile Val Leu Lys Glu Pro Pro Thr Ile Arg Pro Asn 10101015 1020 Ser Pro Tyr Asp Leu Cys Ser Arg Phe Ala Ala Val Met Glu Ser1025 1030 1035 Ile Gln Gly Val Ser Thr Val Thr Val Lys 1040 1045 11 622PRT Homo sapiens misc_feature Incyte ID No 3238072CD1 11 Met Glu Val ArgAsp Leu Tyr Val Phe Cys Tyr Leu Cys Lys Asp 1 5 10 15 Tyr Val Leu AsnAsp Asn Pro Glu Gly Asp Leu Lys Leu Leu Arg 20 25 30 Ser Ser Leu Leu AlaVal Arg Gly Gln Lys Gln Asp Thr Pro Val 35 40 45 Arg Arg Gly Arg Thr LeuArg Ser Met Ala Ser Gly Glu Asp Val 50 55 60 Val Leu Pro Gln Arg Ala ProGln Gly Gln Pro Gln Met Leu Thr 65 70 75 Ala Leu Trp Tyr Arg Arg Gln ArgLeu Leu Ala Arg Thr Leu Arg 80 85 90 Leu Trp Phe Glu Lys Ser Ser Arg GlyGln Ala Lys Leu Glu Gln 95 100 105 Arg Arg Gln Glu Glu Ala Leu Glu ArgLys Lys Glu Glu Ala Arg 110 115 120 Arg Arg Arg Arg Glu Val Lys Arg ArgLeu Leu Glu Glu Leu Ala 125 130 135 Ser Thr Pro Pro Arg Lys Ser Ala ArgLeu Leu Leu His Thr Pro 140 145 150 Arg Asp Ala Gly Pro Ala Ala Ser ArgPro Ala Ala Leu Pro Thr 155 160 165 Ser Arg Arg Val Pro Ala Ala Thr LeuLys Leu Arg Arg Gln Pro 170 175 180 Ala Met Ala Pro Gly Val Thr Gly LeuArg Asn Leu Gly Asn Thr 185 190 195 Cys Tyr Met 
Asn Ser Ile Leu Gln ValLeu Ser His Leu Gln Lys 200 205 210 Phe Arg Glu Cys Phe Leu Asn Leu AspPro Ser Lys Thr Glu His 215 220 225 Leu Phe Pro Lys Ala Thr Asn Gly LysThr Gln Leu Ser Gly Lys 230 235 240 Pro Thr Asn Ser Ser Ala Thr Glu LeuSer Leu Arg Asn Asp Arg 245 250 255 Ala Glu Ala Cys Glu Arg Glu Gly PheCys Trp Asn Gly Arg Ala 260 265 270 Ser Ile Ser Arg Ser Leu Glu Leu IleGln Asn Lys Glu Pro Ser 275 280 285 Ser Lys His Ile Ser Leu Cys Arg GluLeu His Thr Leu Phe Arg 290 295 300 Val Met Trp Ser Gly Lys Trp Ala LeuVal Ser Pro Phe Ala Met 305 310 315 Leu His Ser Val Trp Ser Leu Ile ProAla Phe Arg Gly Tyr Asp 320 325 330 Gln Gln Asp Ala Gln Glu Phe Leu CysGlu Leu Leu His Lys Val 335 340 345 Gln Gln Glu Leu Glu Ser Glu Gly ThrThr Arg Arg Ile Leu Ile 350 355 360 Pro Phe Ser Gln Arg Lys Leu Thr LysGln Val Leu Lys Val Val 365 370 375 Asn Thr Ile Phe His Gly Gln Leu LeuSer Gln Val Thr Cys Ile 380 385 390 Ser Cys Asn Tyr Lys Ser Asn Thr IleGlu Pro Phe Trp Asp Leu 395 400 405 Ser Leu Glu Phe Pro Glu Arg Tyr HisCys Ile Glu Lys Gly Phe 410 415 420 Val Pro Leu Asn Gln Thr Glu Cys LeuLeu Thr Glu Met Leu Ala 425 430 435 Lys Phe Thr Glu Thr Glu Ala Leu GluGly Arg Ile Tyr Ala Cys 440 445 450 Asp Gln Cys Asn Ser Lys Arg Arg LysSer Asn Pro Lys Pro Leu 455 460 465 Val Leu Ser Glu Ala Arg Lys Gln LeuMet Ile Tyr Arg Leu Pro 470 475 480 Gln Val Leu Arg Leu His Leu Lys ArgPhe Arg Trp Ser Gly Arg 485 490 495 Asn His Arg Glu Lys Ile Gly Val HisVal Val Phe Asp Gln Val 500 505 510 Leu Thr Met Glu Pro Tyr Cys Cys ArgAsp Met Leu Ser Ser Leu 515 520 525 Asp Lys Glu Thr Phe Ala Tyr Asp LeuSer Ala Val Val Met His 530 535 540 His Gly Lys Gly Phe Gly Ser Gly HisTyr Thr Ala Tyr Cys Tyr 545 550 555 Asn Thr Glu Gly Gly Phe Trp Val HisCys Asn Asp Ser Lys Leu 560 565 570 Asn Val Cys Ser Val Glu Glu Val CysLys Thr Gln Ala Tyr Ile 575 580 585 Leu Phe Tyr Thr Gln Arg Thr Val GlnGly Asn Ala Arg Ile Ser 590 595 600 Glu Thr His Leu Gln Ala Gln Val GlnSer Ser Asn Asn Asp Glu 605 610 615 Gly Arg Pro Gln Thr Phe 
Ser 620 12345 PRT Homo sapiens misc_feature Incyte ID No 7482034CD1 12 Met Lys ArgGln Leu Thr His Leu Pro Gly Arg Phe Trp Leu Trp 1 5 10 15 Pro Ser PheSer Val Ala Ser Leu Leu Ser His Gln Thr Pro Ala 20 25 30 Thr Asn Ser TrpLeu Ala Ser Ser Lys Leu His Ser Ala Pro Gly 35 40 45 Met Ala Leu Gln AspVal Cys Lys Trp Gln Ser Pro Asp Thr Gln 50 55 60 Gly Pro Ser Pro His LeuPro Arg Ala Gly Gly Trp Ala Val Pro 65 70 75 Arg Gly Cys Asp Pro Gln ThrPhe Leu Gln Ile His Gly Pro Arg 80 85 90 Leu Ala His Gly Thr Thr Thr LeuAla Phe Arg Phe Arg His Gly 95 100 105 Val Ile Ala Ala Ala Asp Thr ArgSer Ser Cys Gly Ser Tyr Val 110 115 120 Ala Cys Pro Ala Ser Cys Lys ValIle Pro Val His Gln His Leu 125 130 135 Leu Gly Thr Thr Ser Gly Thr SerAla Asp Cys Ala Thr Trp Tyr 140 145 150 Arg Val Leu Gln Arg Glu Leu ArgLeu Arg Glu Leu Arg Glu Gly 155 160 165 Gln Leu Pro Ser Val Ala Ser AlaAla Lys Leu Leu Ser Ala Met 170 175 180 Met Ser Gln Tyr Arg Gly Leu AspLeu Cys Val Ala Thr Ala Leu 185 190 195 Cys Gly Trp Asp Arg Ser Gly ProGlu Leu Phe Tyr Val Tyr Ser 200 205 210 Asp Gly Thr Arg Leu Gln Gly AspIle Phe Ser Val Gly Ser Gly 215 220 225 Ser Pro Tyr Ala Tyr Gly Val LeuAsp Arg Gly Tyr Arg Tyr Asp 230 235 240 Met Ser Thr Gln Glu Ala Tyr AlaLeu Ala Arg Cys Ala Val Ala 245 250 255 His Ala Thr His Arg Asp Ala TyrSer Gly Gly Ser Val Asp Leu 260 265 270 Phe His Val Arg Glu Ser Gly TrpGlu His Val Ser Arg Ser Asp 275 280 285 Ala Cys Val Leu Tyr Val Glu LeuGln Lys Leu Leu Glu Pro Glu 290 295 300 Pro Glu Glu Asp Ala Ser His AlaHis Pro Glu Pro Ala Thr Ala 305 310 315 His Arg Ala Ala Glu Asp Arg GluLeu Ser Val Gly Pro Gly Glu 320 325 330 Val Thr Pro Gly Asp Ser Arg MetPro Ala Gly Thr Glu Thr Val 335 340 345 13 948 PRT Homo sapiensmisc_feature Incyte ID No 7474351CD1 13 Met Val Ser Lys Gly Gly Val AlaAla Glu Pro Glu Pro His Tyr 1 5 10 15 Cys Glu Asp Ser Glu Arg Gly ProAsn Thr Leu Thr Gly Pro Gly 20 25 30 Ser Leu Pro Arg Gly Gly Gly Ile GluVal Gly Met Glu Phe Pro 35 40 45 Gly Cys Ser Gly Glu Gly Cys Val Lys ProHis 
Glu Glu Ala Ala 50 55 60 Arg Glu Gly Ala Gly Arg Gly Lys Arg Ala ValPro Gly Pro Lys 65 70 75 Arg Arg Gln Gln Gly Ser Ala Glu Gly Pro Ala AlaGly Trp Thr 80 85 90 Leu Glu Gln Glu Thr Arg Gly Asp Val Leu Glu Asp LysAsn Glu 95 100 105 Arg Ala Asp Glu Glu Ile Leu Arg Leu Ala Pro Gly LysGly Arg 110 115 120 Leu Pro Ile Asp Ser Lys His Leu Lys Pro Val Ile SerSer Phe 125 130 135 Pro Val Arg Ser Gln Glu Leu Gly Glu Gly Ala Gly AlaGly Thr 140 145 150 Leu Arg Gly Lys Met Ala Glu Phe Asn Trp Ser Met AlaPhe Lys 155 160 165 Gly Pro Ala Ala Gly His Glu Glu Arg Leu Asn Ser ValSer Ser 170 175 180 Arg Ala Lys Lys Gly Ile Gly Trp Asp Val Ala Ala AlaSer Leu 185 190 195 Arg Gly Val Asp His Phe Ser Asp Leu Pro Pro Pro LeuGln Val 200 205 210 Arg Glu Glu Leu Glu Ala Cys Ala Phe Arg Val Gln ValGly Gln 215 220 225 Leu Arg Leu Tyr Glu Asp Asp Gln Arg Thr Lys Val ValGlu Ile 230 235 240 Val Arg His Pro Gln Tyr Asn Glu Ser Leu Ser Ala GlnGly Gly 245 250 255 Ala Asp Ile Ala Leu Leu Lys Leu Glu Ala Pro Val ProLeu Ser 260 265 270 Glu Leu Ile His Pro Val Ser Leu Pro Ser Ala Ser LeuAsp Val 275 280 285 Pro Ser Gly Lys Thr Cys Trp Val Thr Gly Trp Gly ValIle Gly 290 295 300 Arg Gly Glu Leu Leu Pro Trp Pro Leu Ser Leu Trp GluAla Thr 305 310 315 Val Lys Val Arg Ser Asn Val Leu Cys Asn Gln Thr CysArg Arg 320 325 330 Arg Phe Pro Ser Asn His Thr Glu Arg Phe Glu Arg LeuIle Lys 335 340 345 Asp Asp Met Leu Cys Ala Gly Asp Gly Asn His Gly SerTrp Pro 350 355 360 Gly Asp Asn Gly Gly Pro Leu Leu Cys Arg Arg Asn CysThr Trp 365 370 375 Val Gln Val Glu Val Val Ser Trp Gly Lys Leu Cys GlyLeu Arg 380 385 390 Gly Tyr Pro Gly Met Tyr Thr Arg Val Thr Ser Tyr ValSer Trp 395 400 405 Ile Arg Gln Pro Cys Pro Ser Ala Gln Thr Pro Ala ValVal Arg 410 415 420 Arg Phe Val Leu Pro Pro Asn Pro Asp Val Glu Ala LeuThr Pro 425 430 435 Ser Val Met Gly Ser Gly Ala Pro Leu Pro Pro Ala ProAsp Leu 440 445 450 Gln Glu Ala Glu Val Pro Ile Met Arg Thr Arg Ala CysGlu Arg 455 460 465 Met Tyr His Lys Gly Pro Thr Ala His Gly Gln Val ThrIle Ile 470 
475 480 Lys Ala Ala Met Pro Cys Ala Gly Arg Lys Gly Gln GlySer Cys 485 490 495 Gln Ala Ala Leu Arg Thr Glu Asp Leu Thr Pro Thr ThrPro Asn 500 505 510 Thr Glu Val Ser Pro Arg Ala Asp Pro Arg Leu Ser GlnPro Glu 515 520 525 Asp Ile Trp Pro Glu Trp Ala Trp Pro Val Val Val GlyThr Thr 530 535 540 Met Leu Leu Leu Leu Leu Phe Leu Ala Val Ser Ser LeuGly Ser 545 550 555 Cys Ser Thr Gly Ser Pro Ala Pro Val Pro Glu Asn AspLeu Val 560 565 570 Gly Ile Val Gly Gly His Asn Thr Pro Gly Glu Val ValVal Ala 575 580 585 Val Gly Ala Asp Arg Arg Ser Leu His Phe Pro Glu GlyHis Arg 590 595 600 Pro Val His Leu Pro Asp Ser His Gln Gly Cys Val SerVal Arg 605 610 615 Gly Pro Gly Ala Ala Glu Cys Gln Pro Asp Arg Arg ProPro Asn 620 625 630 Tyr Ser Val Phe Phe Leu Gly Ala Asp Ile Ala Leu LeuLys Leu 635 640 645 Ala Thr Ser Ser Leu Glu Phe Thr Asp Ser Asp Asn CysTrp Asn 650 655 660 Thr Gly Trp Gly Met Val Gly Leu Leu Asp Met Leu ProPro Pro 665 670 675 Tyr Arg Pro Gln Gln Val Lys Val Leu Thr Leu Ser AsnAla Asp 680 685 690 Cys Glu Arg Gln Thr Tyr Asp Ala Phe Pro Gly Ala GlyAsp Arg 695 700 705 Lys Phe Ile Gln Asp Asp Met Ile Cys Ala Gly Arg ThrGly Arg 710 715 720 Arg Thr Trp Lys Gly Asp Ser Gly Gly Pro Leu Val CysLys Lys 725 730 735 Lys Gly Thr Trp Leu Gln Ala Gly Val Val Ser Trp GlyPhe Tyr 740 745 750 Ser Asp Arg Pro Ser Ile Gly Val Tyr Thr Arg Pro GluThr Ser 755 760 765 Trp Gln Gly Ala Asn His Ala Asp Ala Gln Arg Pro AlaGly Arg 770 775 780 Val Pro Thr Met Gln Arg Pro Arg Asp Met Gly Gln GlyGln Glu 785 790 795 Trp Val Cys Arg Pro Phe Thr His Val Thr Cys Tyr ProThr Ala 800 805 810 Ile Pro Arg Pro Phe Thr His Val Thr Cys Tyr Leu MetAla Val 815 820 825 Pro Ser Thr Leu Thr His Val Thr Cys Tyr Pro Thr AlaVal Pro 830 835 840 Arg Pro Phe Thr His Val Thr Cys Tyr Leu Met Ala ValPro Ser 845 850 855 Thr Leu Thr His Ile Thr Cys Tyr Met Met Ala Val ProArg Pro 860 865 870 Phe Thr His Ile Thr Cys Tyr Pro Met Ala Val Pro SerThr Leu 875 880 885 Thr His Val Thr Cys His Pro Thr Ala Ile Pro Arg ProPhe Thr 890 895 900 His 
Ile Thr Cys Tyr Thr Met Ala Ile Pro Arg Pro SerThr Thr 905 910 915 Pro Pro Ala Thr Arg Arg Pro Ser Pro Ala Pro Ser ProThr Ser 920 925 930 Pro Ala Thr Arg Trp Pro Ser Pro Gly Pro Ser Pro MetSer Pro 935 940 945 Ala Thr Arg 14 444 PRT Homo sapiens misc_featureIncyte ID No 2232483CD1 14 Met Gly Ser Ala Pro Trp Ala Pro Val Leu LeuLeu Ala Leu Gly 1 5 10 15 Leu Arg Gly Leu Gln Ala Gly Ala Arg Arg AlaPro Asp Pro Gly 20 25 30 Phe Gln Glu Arg Phe Phe Gln Gln Arg Leu Asp HisPhe Asn Phe 35 40 45 Glu Arg Phe Gly Asn Arg Thr Phe Pro Gln Arg Phe LeuVal Ser 50 55 60 Asp Arg Phe Trp Val Arg Gly Glu Gly Pro Ile Phe Phe TyrThr 65 70 75 Gly Asn Glu Gly Asp Val Trp Ala Phe Ala Asn Asn Ser Ala Phe80 85 90 Val Ala Glu Leu Ala Ala Glu Arg Gly Ala Leu Leu Val Phe Ala 95100 105 Glu His Arg Tyr Tyr Gly Lys Ser Leu Pro Phe Gly Ala Gln Ser 110115 120 Thr Gln Arg Gly His Thr Glu Leu Leu Thr Val Glu Gln Ala Leu 125130 135 Ala Asp Phe Ala Glu Leu Leu Arg Ala Leu Arg Arg Asp Leu Gly 140145 150 Ala Gln Asp Ala Pro Ala Ile Ala Phe Gly Gly Ser Tyr Gly Gly 155160 165 Met Leu Ser Ala Tyr Leu Arg Met Lys Tyr Pro His Leu Val Ala 170175 180 Gly Ala Leu Ala Ala Ser Ala Pro Val Leu Ala Val Ala Gly Leu 185190 195 Gly Asp Ser Asn Gln Phe Phe Arg Asp Val Thr Ala Gly Ala Tyr 200205 210 Asp Thr Val Arg Trp Glu Phe Gly Thr Cys Gln Pro Leu Ser Asp 215220 225 Glu Lys Asp Leu Thr Gln Leu Phe Met Phe Ala Arg Asn Ala Phe 230235 240 Thr Val Leu Ala Met Met Asp Tyr Pro Tyr Pro Thr Asp Phe Leu 245250 255 Gly Pro Leu Pro Ala Asn Pro Val Lys Val Gly Cys Asp Arg Leu 260265 270 Leu Ser Glu Ala Gln Arg Ile Thr Gly Leu Arg Ala Leu Ala Gly 275280 285 Leu Val Tyr Asn Ala Ser Gly Ser Glu His Cys Tyr Asp Ile Tyr 290295 300 Arg Leu Tyr His Ser Cys Ala Asp Pro Thr Gly Cys Gly Thr Gly 305310 315 Pro Asp Ala Arg Ala Trp Asp Tyr Gln Ala Cys Thr Glu Ile Asn 320325 330 Leu Thr Phe Ala Ser Asn Asn Val Thr Asp Met Phe Pro Asp Leu 335340 345 Pro Phe Thr Asp Glu Leu Arg Pro Ser Asp Leu Arg Ala Ala Ser 350355 360 Asn Ile Ile Phe Ser Asn Gly 
Asn Leu Asp Pro Cys Gly Arg Gly 365370 375 Gly Ile Arg Arg Asn Leu Ser Ala Ser Val Ile Ala Val Thr Ile 380385 390 Gln Gly Gly Ala His His Leu Asp Leu Arg Ala Ser His Pro Glu 395400 405 Asp Pro Ala Ser Val Val Glu Ala Arg Lys Leu Glu Ala Thr Ile 410415 420 Ile Gly Glu Cys Val Lys Ala Ala Arg Arg Glu Gln Gln Pro Ala 425430 435 Leu Arg Trp Gly Ala Gln Ile Ser Leu 440 15 514 PRT Homo sapiensmisc_feature Incyte ID No 7481712CD1 15 Met Arg Val Pro Phe Ser Glu LeuLys Asp Ile Lys Ala Tyr Leu 1 5 10 15 Glu Ser His Gly Leu Ala Tyr SerIle Met Ile Lys Asp Ile Gln 20 25 30 Val Lys Pro Cys Pro Ser Trp Asp ProAla Phe Arg Leu Pro Phe 35 40 45 Trp Leu Gly Pro Asn Met Glu Glu Met PheSer Gly Leu Lys Val 50 55 60 Asp Met Trp Phe Leu Gly Leu His Gln Arg ValCys Glu His Ala 65 70 75 Val Glu Gly Thr Gly Cys Pro Pro Pro His Phe ThrLys Ala Ser 80 85 90 Leu Asp Asn Val Thr Arg Asn Phe Gln Ile Gln Pro AspGly Arg 95 100 105 Leu Ser Met Phe Leu Phe Gln Gln His Asn Trp Ser LeuSer Pro 110 115 120 Ser Trp Ser Leu Ser Leu Pro Leu Ala Ser Arg Thr SerVal Phe 125 130 135 Cys Leu Gln Pro Ala Pro Pro Leu Leu Asp Pro Thr AlaTyr Ser 140 145 150 Val Phe Pro Pro Gly Gly Ala Met Gly Ile Ser Asn PhePro Ala 155 160 165 Pro Gly Met Glu Gln Thr Leu Val His Phe Pro Gly GlnGly Arg 170 175 180 Phe Leu Phe Leu Glu Val Gly Pro Ala Val Leu Leu AspGlu Glu 185 190 195 Arg Gln Ala Met Ala Lys Ser Arg Arg Leu Glu Arg SerThr Asn 200 205 210 Ser Phe Ser Tyr Ser Ser Tyr His Thr Leu Glu Glu IleTyr Ser 215 220 225 Trp Ile Asp Asn Phe Val Met Glu His Ser Asp Ile ValSer Lys 230 235 240 Ile Gln Ile Gly Asn Ser Phe Glu Asn Gln Ser Ile LeuVal Leu 245 250 255 Lys Phe Ser Thr Gly Gly Ser Arg His Pro Ala Ile TrpIle Asp 260 265 270 Thr Gly Ile His Ser Arg Glu Trp Ile Thr His Ala ThrGly Ile 275 280 285 Trp Thr Ala Asn Lys Ile Val Ser Asp Tyr Gly Lys AspArg Val 290 295 300 Leu Thr Asp Ile Leu Asn Ala Met Asp Ile Phe Ile GluLeu Val 305 310 315 Thr Asn Pro Asp Gly Phe Ala Phe Thr His Ser Met AsnArg Leu 320 325 330 Trp Arg Lys Asn Lys Ser 
Ile Arg Pro Gly Ile Phe CysIle Gly 335 340 345 Val Asp Leu Asn Arg Asn Trp Lys Ser Gly Phe Gly GlyAsn Gly 350 355 360 Ser Asn Ser Asn Pro Cys Ser Glu Thr Tyr His Gly ProSer Pro 365 370 375 Gln Ser Glu Pro Glu Val Ala Ala Ile Val Asn Phe IleThr Ala 380 385 390 His Gly Asn Phe Lys Ala Leu Ile Ser Ile His Ser TyrSer Gln 395 400 405 Met Leu Met Tyr Pro Tyr Gly Arg Leu Leu Glu Pro ValSer Asn 410 415 420 Gln Arg Glu Leu Tyr Asp Leu Ala Lys Asp Ala Val GluAla Leu 425 430 435 Tyr Lys Val His Gly Ile Glu Tyr Ile Phe Gly Ser IleSer Thr 440 445 450 Thr Leu Tyr Val Ala Ser Gly Ile Thr Val Asp Trp AlaTyr Asp 455 460 465 Ser Gly Ile Lys Tyr Ala Phe Ser Phe Glu Leu Arg AspThr Gly 470 475 480 Gln Tyr Gly Phe Leu Leu Pro Ala Thr Gln Ile Ile ProThr Ala 485 490 495 Gln Glu Thr Trp Met Ala Leu Arg Thr Ile Met Glu HisThr Leu 500 505 510 Asn His Pro Tyr 16 787 PRT Homo sapiens misc_featureIncyte ID No 8213480CD1 16 Met Gly Trp Arg Pro Arg Arg Ala Arg Gly ThrPro Leu Leu Leu 1 5 10 15 Leu Leu Leu Leu Leu Leu Leu Trp Pro Val ProGly Ala Gly Val 20 25 30 Leu Gln Gly His Ile Pro Gly Gln Pro Val Thr ProHis Trp Val 35 40 45 Leu Asp Gly Gln Pro Trp Arg Thr Val Ser Leu Glu GluPro Val 50 55 60 Ser Lys Pro Asp Met Gly Leu Val Ala Leu Glu Ala Glu GlyGln 65 70 75 Glu Leu Leu Leu Glu Leu Glu Lys Asn His Arg Leu Leu Ala Pro80 85 90 Gly Tyr Ile Glu Thr His Tyr Gly Pro Asp Gly Gln Pro Val Val 95100 105 Leu Ala Pro Asn His Thr Asp His Cys His Tyr Gln Gly Arg Val 110115 120 Arg Gly Phe Pro Asp Ser Trp Val Val Leu Cys Thr Cys Ser Gly 125130 135 Met Ser Gly Leu Ile Thr Leu Ser Arg Asn Ala Ser Tyr Tyr Leu 140145 150 Arg Pro Trp Pro Pro Arg Gly Ser Lys Asp Phe Ser Thr His Glu 155160 165 Ile Phe Arg Met Glu Gln Leu Leu Thr Trp Lys Gly Thr Cys Gly 170175 180 His Arg Asp Pro Gly Asn Lys Ala Gly Met Thr Ser Leu Pro Gly 185190 195 Gly Pro Gln Ser Arg Gly Arg Arg Glu Ala Arg Arg Thr Arg Lys 200205 210 Tyr Leu Glu Leu Tyr Ile Val Ala Asp His Thr Leu Phe Leu Thr 215220 225 Arg His Arg Asn Leu Asn His Thr Lys Gln Arg 
Leu Leu Glu Val 230235 240 Ala Asn Tyr Val Asp Gln Leu Leu Arg Thr Leu Asp Ile Gln Val 245250 255 Ala Leu Thr Gly Leu Glu Val Trp Thr Glu Arg Asp Arg Ser Arg 260265 270 Val Thr Gln Asp Ala Asn Ala Thr Leu Trp Ala Phe Leu Gln Trp 275280 285 Arg Arg Gly Leu Trp Ala Gln Arg Pro His Asp Ser Ala Gln Leu 290295 300 Leu Thr Gly Arg Ala Phe Gln Gly Ala Thr Val Gly Leu Ala Pro 305310 315 Val Glu Gly Met Cys Arg Ala Glu Ser Ser Gly Gly Val Ser Thr 320325 330 Asp His Ser Glu Leu Pro Ile Gly Ala Ala Ala Thr Met Ala His 335340 345 Glu Ile Gly His Ser Leu Gly Leu Ser His Asp Pro Asp Gly Cys 350355 360 Cys Val Glu Ala Ala Ala Glu Ser Gly Gly Cys Val Met Ala Ala 365370 375 Ala Thr Gly His Pro Phe Pro Arg Val Phe Ser Ala Cys Ser Arg 380385 390 Arg Gln Leu Arg Ala Phe Phe Arg Lys Gly Gly Gly Ala Cys Leu 395400 405 Ser Asn Ala Pro Asp Pro Gly Leu Pro Val Pro Pro Ala Leu Cys 410415 420 Gly Asn Gly Phe Val Glu Ala Gly Glu Glu Cys Asp Cys Gly Pro 425430 435 Gly Gln Glu Cys Arg Asp Leu Cys Cys Phe Ala His Asn Cys Ser 440445 450 Leu Arg Pro Gly Ala Gln Cys Ala His Gly Asp Cys Cys Val Arg 455460 465 Cys Leu Leu Lys Pro Ala Gly Ala Leu Cys Arg Gln Ala Met Gly 470475 480 Asp Cys Asp Leu Pro Glu Phe Cys Thr Gly Thr Ser Ser His Cys 485490 495 Pro Pro Asp Val Tyr Leu Leu Asp Gly Ser Pro Cys Ala Arg Gly 500505 510 Ser Gly Tyr Cys Trp Asp Gly Ala Cys Pro Thr Leu Glu Gln Gln 515520 525 Cys Gln Gln Leu Trp Gly Pro Gly Ser His Pro Ala Pro Glu Ala 530535 540 Cys Phe Gln Val Val Asn Ser Ala Gly Asp Ala His Gly Asn Cys 545550 555 Gly Gln Asp Ser Glu Gly His Phe Leu Pro Cys Ala Gly Arg Asp 560565 570 Ala Leu Cys Gly Lys Leu Gln Cys Gln Gly Gly Lys Pro Ser Leu 575580 585 Leu Ala Pro His Met Val Pro Val Asp Ser Thr Val His Leu Asp 590595 600 Gly Gln Glu Val Thr Cys Arg Gly Ala Leu Ala Leu Pro Ser Ala 605610 615 Gln Leu Asp Leu Leu Gly Leu Gly Leu Val Glu Pro Gly Thr Gln 620625 630 Cys Gly Pro Arg Met Val Cys Asn Ser Asn His Asn Cys His Cys 635640 645 Ala Pro Gly Trp Ala Pro Pro Phe Cys Asp Lys Pro Gly Phe 
Gly 650655 660 Gly Ser Met Asp Ser Gly Pro Val Gln Ala Glu Asn His Asp Thr 665670 675 Phe Leu Leu Ala Met Leu Leu Ser Val Leu Leu Pro Leu Leu Pro 680685 690 Gly Ala Gly Leu Ala Trp Cys Cys Tyr Arg Leu Pro Gly Ala His 695700 705 Leu Gln Arg Cys Ser Trp Gly Cys Arg Arg Asp Pro Ala Cys Ser 710715 720 Gly Pro Lys Asp Gly Pro His Arg Asp His Pro Leu Gly Gly Val 725730 735 His Pro Thr Glu Leu Gly Pro Thr Ala Thr Gly Gln Ser Trp Pro 740745 750 Leu Asp Pro Glu Asn Ser His Glu Pro Ser Ser His Pro Glu Lys 755760 765 Pro Leu Pro Ala Val Ser Pro Asp Pro Gln Ala Asp Gln Val Gln 770775 780 Met Pro Arg Ser Cys Leu Trp 785 17 1082 PRT Homo sapiensmisc_feature Incyte ID No 7478405CD1 17 Met Glu Cys Ala Leu Leu Leu AlaCys Ala Phe Pro Ala Ala Gly 1 5 10 15 Ser Gly Pro Pro Arg Gly Leu AlaGly Leu Gly Arg Val Ala Lys 20 25 30 Ala Leu Gln Leu Cys Cys Leu Cys CysAla Ser Val Ala Ala Ala 35 40 45 Leu Ala Ser Asp Ser Ser Ser Gly Ala SerGly Leu Asn Asp Asp 50 55 60 Tyr Val Phe Val Thr Pro Val Glu Val Asp SerAla Gly Ser Tyr 65 70 75 Ile Ser His Asp Ile Leu His Asn Gly Arg Lys LysArg Ser Ala 80 85 90 Gln Asn Ala Arg Ser Ser Leu His Tyr Arg Phe Ser AlaPhe Gly 95 100 105 Gln Glu Leu His Leu Glu Leu Lys Pro Ser Ala Ile LeuSer Ser 110 115 120 His Phe Ile Val Gln Val Leu Gly Lys Asp Gly Ala SerGlu Thr 125 130 135 Gln Lys Pro Glu Val Gln Gln Cys Phe Tyr Gln Gly PheIle Arg 140 145 150 Asn Asp Ser Ser Ser Ser Val Ala Val Ser Thr Cys AlaGly Leu 155 160 165 Ser Gly Leu Ile Arg Thr Arg Lys Asn Glu Phe Leu IleSer Pro 170 175 180 Leu Pro Gln Leu Leu Ala Gln Glu His Asn Tyr Ser SerPro Ala 185 190 195 Gly His His Pro His Val Leu Tyr Lys Arg Thr Ala GluGlu Lys 200 205 210 Ile Gln Arg Tyr Arg Gly Tyr Pro Gly Ser Gly Arg AsnTyr Pro 215 220 225 Gly Tyr Ser Pro Ser His Ile Pro His Ala Ser Gln SerArg Glu 230 235 240 Thr Glu Tyr His His Arg Arg Leu Gln Lys Gln His PheCys Gly 245 250 255 Arg Arg Lys Lys Tyr Ala Pro Lys Pro Pro Thr Glu AspThr Tyr 260 265 270 Leu Arg Phe Asp Glu Tyr Gly Ser Ser Gly Arg Pro ArgArg Ser 
275 280 285 Ala Gly Lys Ser Gln Lys Gly Leu Asn Val Glu Thr LeuVal Val 290 295 300 Ala Asp Lys Lys Met Val Glu Lys His Gly Lys Gly AsnVal Thr 305 310 315 Thr Tyr Ile Leu Thr Val Met Asn Met Val Ser Gly LeuPhe Lys 320 325 330 Asp Gly Thr Ile Gly Ser Asp Ile Asn Val Val Val ValSer Leu 335 340 345 Ile Leu Leu Glu Gln Glu Pro Gly Gly Leu Leu Ile AsnHis His 350 355 360 Ala Asp Gln Ser Leu Asn Ser Phe Cys Gln Trp Gln SerAla Leu 365 370 375 Ile Gly Lys Asn Gly Lys Arg His Asp His Ala Ile LeuLeu Thr 380 385 390 Gly Phe Asp Ile Cys Ser Trp Lys Asn Glu Pro Cys AspThr Leu 395 400 405 Gly Phe Ala Pro Ile Ser Gly Met Cys Ser Lys Tyr ArgSer Cys 410 415 420 Thr Ile Asn Glu Asp Thr Gly Leu Gly Leu Ala Phe ThrIle Ala 425 430 435 His Glu Ser Gly His Asn Phe Gly Met Ile His Asp GlyGlu Gly 440 445 450 Asn Pro Cys Arg Lys Ala Glu Gly Asn Ile Met Ser ProThr Leu 455 460 465 Thr Gly Asn Asn Gly Val Phe Ser Trp Ser Ser Cys SerArg Gln 470 475 480 Tyr Leu Lys Lys Phe Leu Ser Thr Pro Gln Ala Gly CysLeu Val 485 490 495 Asp Glu Pro Lys Gln Ala Gly Gln Tyr Lys Tyr Pro AspLys Leu 500 505 510 Pro Gly Gln Ile Tyr Asp Ala Asp Thr Gln Cys Lys TrpGln Phe 515 520 525 Gly Ala Lys Ala Lys Leu Cys Ser Leu Gly Phe Val LysAsp Ile 530 535 540 Cys Lys Ser Leu Trp Cys His Arg Val Gly His Arg CysGlu Thr 545 550 555 Lys Phe Met Pro Ala Ala Glu Gly Thr Val Cys Gly LeuSer Met 560 565 570 Trp Cys Arg Gln Gly Gln Cys Val Lys Phe Gly Glu LeuGly Pro 575 580 585 Arg Pro Ile His Gly Gln Trp Ser Ala Trp Ser Lys TrpSer Glu 590 595 600 Cys Ser Arg Thr Cys Gly Gly Gly Val Lys Phe Gln GluArg His 605 610 615 Cys Asn Asn Pro Lys Pro Gln Tyr Gly Gly Ile Phe CysPro Gly 620 625 630 Ser Ser Arg Ile Tyr Gln Leu Cys Asn Ile Asn Pro CysAsn Glu 635 640 645 Asn Ser Leu Asp Phe Arg Ala Gln Gln Cys Ala Glu TyrAsn Ser 650 655 660 Lys Pro Phe Arg Gly Trp Phe Tyr Gln Trp Lys Pro TyrThr Lys 665 670 675 Val Glu Glu Glu Asp Arg Cys Lys Leu Tyr Cys Lys AlaGlu Asn 680 685 690 Phe Glu Phe Phe Phe Ala Met Ser Gly Lys Val Lys AspGly Thr 695 700 705 
Pro Cys Ser Pro Asn Lys Asn Asp Val Cys Ile Asp GlyVal Cys 710 715 720 Glu Leu Val Gly Cys Asp His Glu Leu Gly Ser Lys AlaVal Ser 725 730 735 Asp Ala Cys Gly Val Cys Lys Gly Asp Asn Ser Thr CysLys Phe 740 745 750 Tyr Lys Gly Leu Tyr Leu Asn Gln His Lys Ala Asn GluTyr Tyr 755 760 765 Pro Val Val Ile Ile Pro Ala Gly Ala Arg Ser Ile GluIle Gln 770 775 780 Glu Leu Gln Val Ser Ser Ser Tyr Leu Ala Val Arg SerLeu Ser 785 790 795 Gln Lys Tyr Tyr Leu Thr Gly Gly Trp Ser Ile Asp TrpPro Gly 800 805 810 Glu Phe Pro Phe Ala Gly Thr Thr Phe Glu Tyr Gln ArgSer Phe 815 820 825 Asn Arg Pro Glu Arg Leu Tyr Ala Pro Gly Pro Thr AsnGlu Thr 830 835 840 Leu Val Phe Glu Ile Leu Met Gln Gly Lys Asn Pro GlyIle Ala 845 850 855 Trp Lys Tyr Ala Leu Pro Lys Val Met Asn Gly Thr ProPro Ala 860 865 870 Thr Lys Arg Pro Ala Tyr Thr Trp Ser Ile Val Gln SerGlu Cys 875 880 885 Ser Val Ser Cys Gly Gly Gly Tyr Ile Asn Val Lys AlaIle Cys 890 895 900 Leu Arg Asp Gln Asn Thr Gln Val Asn Ser Ser Phe CysSer Ala 905 910 915 Lys Thr Lys Pro Val Thr Glu Pro Lys Ile Cys Asn AlaPhe Ser 920 925 930 Cys Pro Ala Tyr Trp Met Pro Gly Glu Trp Ser Thr CysSer Lys 935 940 945 Ala Cys Ala Gly Gly Gln Gln Ser Arg Lys Ile Gln CysVal Gln 950 955 960 Lys Lys Pro Phe Gln Lys Glu Glu Ala Val Leu His SerLeu Cys 965 970 975 Pro Val Ser Thr Pro Thr Gln Val Gln Ala Cys Asn SerHis Ala 980 985 990 Cys Pro Pro Gln Trp Ser Leu Gly Pro Trp Ser Gln CysSer Lys 995 1000 1005 Thr Cys Gly Arg Gly Val Arg Lys Arg Glu Leu LeuCys Lys Gly 1010 1015 1020 Ser Ala Ala Glu Thr Leu Pro Glu Ser Gln CysThr Ser Leu Pro 1025 1030 1035 Arg Pro Glu Leu Gln Glu Gly Cys Val LeuGly Arg Cys Pro Lys 1040 1045 1050 Asn Ser Arg Leu Gln Trp Val Ala SerSer Trp Ser Glu Val Leu 1055 1060 1065 Ile Arg Ser His Cys Trp Val ArgArg Leu Arg Pro Ser Trp Leu 1070 1075 1080 Thr Gln 18 1187 DNA Homosapiens misc_feature Incyte ID No 6930294CB1 18 ccgcaacctt gaagcggcatccgtggagtg cgcctgcgca gctacgaccg cagcaggaaa 60 gcgccgccgg ccaggcccagctgtggccgg acagggactg gaagagagga cgcggtcgag 
120 taggtgtgca ccagccctggcaacgagagc gtctaccccg aactctgctg gccttgaggt 180 tttaaaacat gaatcctacactcatccttg ctgccttttg cctgggaatt gcctcagcta 240 ctctaacatt tgatcacagtttagaggcac agtggaccaa gtggaaggcg atgcacaaca 300 gattatacgg catgaatgaagaaggatgga ggagagcagt gtgggagaag aacatgaaga 360 tgattgaact gcacaatcaggaatacaggg aagggaaaca cagcttcaca atggccatga 420 acgcctttgg agacatgaccagtgaagaat tcaggcaggt gatgaatggc tttcaaaacc 480 gtaagcccag gaaggggaaagtgttccagg aacctctgtt ttatgaggcc cccagatctg 540 tggattggag agagaaaggctacgtgactc ctgtgaagaa tcagggtcag tgtggttctt 600 gttgggcttt tagtgctactggtgctcttg aaggacagat gttccggaaa actgggaggc 660 ttatctcact gagtgagcagaatctggtag actgctctgg gcctcaaggc aatgaaggct 720 gcaatggtgg cctaatggattatgctttcc agtatgttca ggatactgga ggcctggact 780 ctgaggaatc ctatccatatgaggcaacag aagaatcctg taggtacaat cccaagtatt 840 ctgctgctaa tgacactggctttgtggaca tcccttcaca ggagaaggac ctggcgaagg 900 cagtggcaac tgtggggcccatctctgttg ctgctggtgc aagccatgtc tccttccagt 960 tctataaaaa aggtatttattttgagccac gctgtgaccc cgaaggtctg gatcatgcta 1020 tgctgctggt tggctacagctatgaaggag cagactcaga taacaataaa tattggctgg 1080 tgaagaacag gtatggtaaaaactggggca tggatggcta cataaagatg gccaaagacc 1140 agaggaacaa ctgtggaattgccacagcag ccagctaccc cactgtg 1187 19 461 DNA Homo sapiens misc_featureIncyte ID No 7473018CB1 19 aagaattcgg cacgagggcc atggctgacc aactcttgcgtaaaaagaga agaattttta 60 tccattcagt gggtgcaggc acaataaatg ccttgctggattgcctatta gaggatgaag 120 ttattagcca ggaagacatg aacaaagtga gagatgaaaatgacactgtc atggataagg 180 ctcgagtctt gattgacctt gttactggaa aaggacccaagtcttgctgc aaatttatca 240 agcatctctg tgaagaagac cctcaacttg cctcaaagatgggtttgcac taagagagaa 300 gatggaactc tggagcactt cagagacttc ccagagcttcttccaaggga gaagatattc 360 tcgtgaaaga aaaaaacaaa acaaaacaac agtgcttttttcaaacctga ttaatttcat 420 caatttccaa taaatctttc attctctcaa aaaaaaaaaa a461 20 1884 DNA Homo sapiens misc_feature Incyte ID No 7479221CB1 20atgtcccagc tctcctccac cctgaagcgc tacacagaat cggcccgcta cacagatgcc 60cactatgcca agtcgggcta tggtgcctac 
accccatcct cctatggggc caatctggct 120gcctccttac tggagaagga gaaacttggt ttcaagccgg tccccaccag cagcttcctc 180acccgtcccc gtacctatgg cccctcctcc ctcctggact atgaccgggg ccgccccctg 240ctgagacccg acatcactgg gggtggtaag cgggcagaga gccagacccg gggtactgag 300cggcctttag gcagtggcct cagcgggggc agcggattcc cttatggagt gaccaacaac 360tgcctcagct acctgcccat caatgcctat gaccaggggg tgaccctaac ccagaagctg 420gacagccaat cagacctggc ccgggatttc tccagcctcc ggacctcaga tagctaccgg 480atagacccca ggaacctggg ccgcagcccc atgctggccc ggacgcgcaa ggagctctgc 540accctgcagg ggctctacca gacagccagc tgccctgaat acctggtcga ctacctggag 600aactatggtc gcaagggcag tgcatctcag gtgccctccc aggcccctcc ctcacgagtc 660cctgaaatca tcagcccaac ctaccgaccc attggccgct acacgctgtg ggagacggga 720aagggtcagg cccctgggcc cagccgctcc agctccccgg gaagagacgg catgaattct 780aagagtgccc agggtctggc tggtcttcga aaccttggga acacgtgctt catgaactca 840attctgcagt gcctgagcaa cactcgggag ttgagagatt actgcctcca gaggctctac 900atgcgggacc tgcaccacgg cagcaatgca cacacagccc tcgtggaaga gtttgcaaaa 960ctaattcaga ccatatggac ttcatccccc aatgatgtgg tgagcccatc tgagttcaag 1020acccagatcc agagatatgc accgcgcttt gttggctata atcagcagga tgctcaggag 1080ttccttcgct ttcttctgga tgggctccat aacgaggtga accgagtgac actgagacct 1140aagtccaacc ctgagaacct cgatcatctt cctgatgacg agaaaggccg acagatgtgg 1200agaaaatatc tagaacggga agacagtagg atcggggatc tctttgttgg gcagctaaag 1260agctcgctga cgtgtacaga ttgtggttac tgttctacgg tcttcgaccc cttctgggac 1320ctctcactgc ccattgctaa gcgaggttat cctgaggtga cattaatgga ctgcatgagg 1380ctcttcacca aagaggatgt gcttgatgga gatgaaaagc caacatgctg tcgctgccga 1440ggcagaaaac ggtgtataaa gaagttctcc atccagaggt tcccaaagat cttggtgctc 1500catctgaagc ggttctcaga atccaggatc cgaaccagca agctcacaac atttgtgaac 1560ttccccctaa gagacctgga cttaagagaa tttgcctcag aaaacaccaa ccatgctgtt 1620tacaacctgt acgctgtgtc caatcactcc ggaaccacca tgggtggcca ctatacagcc 1680tactgtcgca gtccagggac aggagaatgg cacactttca acgactccag cgtcactccc 1740atgtcctcca gccaagtgcg caccagcgac gcctacctgc tcttctacga actggccagc 1800ccgccctccc 
gaatgtagcg ccaggagcca cgtcccttct cccttccccg tggtggcccc 1860gctccctaaa ttttttaaaa aaac 1884 21 2576 DNA Homo sapiens misc_featureIncyte ID No 2923874CB1 21 tagacaaaag aaaaatccaa agggaaaatg ctgatctctggaatactttg gacattcatg 60 catcaaaagc caactgcaag ccactatttg caagtcaagtctcaagatgg tatactatct 120 ccaggaaaag gcttggaaga tacagatgtg gtgtataaaagcgagaatgg acatgtcatt 180 aaactgaata tagaaacaaa tgctaccaca ttattattggaaaacacaac ttttgtaacc 240 ttcaaagcat caagacattc agtttcacca gatttaaaatatgtccttct ggcatatgat 300 gtcaaacaga tttttcatta ttcgtatact gcttcatatgtgatttacaa catacacact 360 agggaagttt gggagttaaa tcctccagaa gtagaggactccgtcttgca gtacgcggcc 420 tggggtgtcc aagggcagca gctgatttat atttttgaaaataatatcta ctatcaacct 480 gatataaaga gcagttcatt gcgactgaca tcttctggaaaagaagaaat aatttttaat 540 gggattgctg actggttata tgaagaggaa ctcctgcattctcacatcgc ccactggtgg 600 tcaccagatg gagaaagact tgccttcctg atgataaatgactctttggt acccaccatg 660 gttatccctc ggtttactgg agcgttgtat cccaaaggaaagcagtatcc gtatcctaag 720 gcaggtcaag tgaacccaac aataaaatta tatgttgtaaacctgtatgg accaactcac 780 actttggagc tcatgccacc tgacagcttt aaatcaagagaatactatat cactatggtt 840 aaatgggtaa gcaataccaa gactgtggta agatggttaaaccgacctca gaacatctcc 900 atcctcacag tctgtgagac cactacaggt gcttgtagtaaaaaatatga gatgacatca 960 gatacgtggc tctctcagca gaatgaggag cccgtgttttctagagacgg cagcaaattc 1020 tttatgacag tgcctgttaa gcaaggggga cgtggagaatttcaccacat agctatgttc 1080 ctcatccaga gtaaaagtga gcaaattacc gtgcggcatctgacatcagg aaactgggaa 1140 gtgataaaga tcttggcata cgatgaaact actcaaaaaatttactttct gagcactgaa 1200 tcttctccca gaggaaggca gctgtacagt gcttctactgaaggattatt gaatcgccaa 1260 tgcatttcat gtaatttcat gaaagaacaa tgtacatattttgatgccag ttttagtccc 1320 atgaatcaac atttcttatt attctgtgaa ggtccaagggtcccagtggt cagcctacat 1380 agtacggaca acccagcaaa atattttata ttggaaagcaattctatgct gaaggaagct 1440 atcctgaaga agaagatagg aaagccagaa attaaaatccttcatattga cgactatgaa 1500 cttcctttac agttgtccct tcccaaagat tttatggaccgaaaccagta tgctcttctg 1560 ttaataatgg atgaagaacc aggaggccag 
ctggttacagataagttcca tattgactgg 1620 gattccgtac tcattgacat ggataatgtc attgtagcaagatttgatgg cagaggaagt 1680 ggattccagg gtctgaaaat tttgcaggag attcatcgaagattaggttc agtagaagta 1740 aaggaccaaa taacagctgt gaaatttttg ctgaaactgccttacattga ctccaaaaga 1800 ttaagcattt ttggaaaggg ttatggtggc tatattgcatcaatgatctt aaaatcagat 1860 gaaaagcttt ttaaatgtgg atccgtggtt gcacctatcacagacttgaa attgtatgcc 1920 tcagctttct ctgaaagata ccttgggatg ccatctaaggaagaaagcac ttaccaggca 1980 gccagtgtgc tacataatgt tcatggcttg aaagaagaaaatatattaat aattcatgga 2040 actgctgaca caaaagttca tttccaacac tcagcagaattaatcaagca cctaataaaa 2100 gctggagtga attatactat gcaggtctac ccagatgaaggtcataacgt atctgagaag 2160 agcaagtatc atctctacag cacaatcctc aaattcttcagtgattgttt gaaggaagaa 2220 atatctgtgc taccacagga accagaagaa gatgaataatggaccgtatt tatacagaac 2280 tgaagggaat attgaggctc aatgaaacct gacaagagactgtaatattg tagttgctcc 2340 agaatgtcaa gggcagctta cggagatgtc actggagcagcacgctcaga gacagtgaac 2400 tagcatttga atacacaagt ccaagtctac tgtgttgctaggggtgcaga acccgtttct 2460 ttgtatgaga gaggtcaagg gttggtttcc tgggagaaattagttttgca ttaaagtagg 2520 agtagtgcat gttttcttct gttatccccc tgtttgttctgtaactagtt ctctca 2576 22 2000 DNA Homo sapiens misc_feature Incyte IDNo 55122335CB1 22 gctctgcggc catggcgagc ggcgagcatt cccccggcag cggcgcggcccggcggccgc 60 tgcactccgc gcaggctgtg gacgtggcct cggcctccaa cttccgggcctttgagctgc 120 tgcacttgca cctggacctg cgggctgagt tcgggcctcc agggcccggcgcagggagcc 180 gggggctgag cggcaccgcg gtcctggacc tgcgctgcct ggagcccgagggcgccgccg 240 agctgcggct ggactcgcac ccgtgcctgg aggtgacggc ggcggcgctgcggcgggagc 300 ggcccggctc ggaggagccg cctgcggagc ccgtgagctt ctacacgcagcccttctcgc 360 actatggcca ggccctgtgc gtgtccttcc cgcagccctg ccgcgccgccgagcgcctcc 420 aggtgctgct cacctaccgc gtcggggagg gacccggggt ttgctggttggctcccgagc 480 agacagcagg aaagaagaag cccttcgtgt acacccaggg ccaggctgtcctaaaccggg 540 ccttcttccc ttgcttcgac acgcctgctg ttaaatacaa gtattcagctcttattgagg 600 tcccagatgg cttcacagct gtgatgagtg ctagcacctg ggagaagagaggtccaaata 660 agttcttctt ccagatgtgt 
cagcccatcc cctcctatct gatagctttggccatcggag 720 atctggtttc ggctgaagtt ggacccagga gccgggtgtg ggctgagccctgcctgattg 780 atgctgccaa ggaggagtac aacggggtga tagaagaatt tttggcaacaggagagaagc 840 tttttggacc ttatgtttgg ggaaggtatg acttgctctt catgccaccgtcctttccat 900 ttggaggaat ggagaaccct tgtctgacct ttgtcacccc ctgcctgctagctggggacc 960 gctccttggc agatgtcatc atccatgaga tctcccacag ttggtttgggaacctggtca 1020 ccaacgccaa ctggggtgaa ttctggctca atgaaggttt caccatgtacgcccagagga 1080 ggatctccac catcctcttt ggcgctgcgt acacctgctt ggaggctgcaacggggcggg 1140 ctctgctgcg tcagcacatg gacatcactg gagaggaaaa cccactcaacaagctccgcg 1200 tgaagattga accaggcgtt gacccggacg acacctataa tgagaccccctacgagaaag 1260 gtttctgctt tgtctcatac ctggcccact tggtgggtga tcaggatcagtttgacagtt 1320 ttctcaaggc ctatgtgcat gaattcaaat tccgaagcat cttagccgatgactttctgg 1380 acttctactt ggaatatttc cctgagctta agaaaaagag agtggatatcattccaggtt 1440 ttgagtttga tcgatggctg aatacccccg gctggccccc gtacctccctgatctctccc 1500 ctggggactc actcatgaag cctgctgaag agctagccca actgtgggcagccgaggagc 1560 tggacatgaa ggccattgaa gccgtggcca tctctccctg gaagacctaccagctggtct 1620 acttcctgga taagatcctc cagaaatccc ctctccctcc tgggaatgtgaaaaaacttg 1680 gagacacata cccaagtatc tcaaatgccc ggaatgcaga gctccggctgcgatggggcc 1740 aaatcatcct taagaacgac caccaggaag atttctggaa agtgaaggagttcctgcata 1800 accaggggaa gcagaagtat acacttccgc tgtaccacgc aatgatgggtggcagtgagg 1860 tggcccagac cctcgccaag gagacttttg catccaccgc ctcccagctccacagcaatg 1920 ttgtcaacta tgtccagcag atcgtggcac ccaagggcag ttagaggctcgtgtgcatgg 1980 cccctgcctc ttcaggctct 2000 23 3522 DNA Homo sapiensmisc_feature Incyte ID No 7473550CB1 23 ggagggagga cgtgcgaggc cggtgcgtggaggctatggg cctgctggcc agtgctggtt 60 tgttgctgtt gctggtcatc ggccaccccagaagcctagg actgaagtgt ggaattcgca 120 tggtcaacat gaaaagtaag gaacctgccgtgggatctag attcttctct agaattagta 180 gttggagaaa ttcaacagtg actggacatccatggcaggt ctacctaaaa tcagatgagc 240 accacttctg tggaggaagc ttgattcaagaagatcgggt tgttacagca gcacactgcc 300 tgcacagcct cagtgagaag cagctgaagaatataactgt 
gacttctggg gagtacagcc 360 tctttcagaa ggataagcaa gaacagaatattcctgtctc aaaaattatt acccatcctg 420 aatacaacag ccgtgaatat atgagtcctgatattgcact gctgtatcta aaacacaaag 480 tcaagtttgg aaatgctgtt cagccaatctgtcttcctga cagcgatgat aaagttgaac 540 caggaattct ttgcttatcc agtggatggggcaagatttc caaaacatca gaatattcaa 600 atgtcctaca agaaatggaa cttcccatcatggatgacag agcgtgtaat actgtgctca 660 agagcatgaa cctccctccc ctgggaaggaccatgctgtg tgctggcttc cctgattggg 720 gaatggacgc ctgccagggg gactctggaggaccactggt ttgtagaaga ggtggtggaa 780 tctggattct tgctgggata acttcctgggtagctggttg tgctggaggt tcagttcccg 840 taagaaacaa ccatgtgaag gcatcacttggcattttctc caaagtgtct gagttgatgg 900 attttatcac tcaaaacctg ttcacaggtttggatcgggg ccaacccctc tcaaaagtgg 960 gctcaaggta tataacaaag gccctgagttctgtccaaga agtgaatgga agccagagag 1020 ataaaataat cctgataaaa tttacaagtttagacatgga aaagcaagtt ggatgtgatc 1080 atgactatgt atctttacga tcaagcagtggagtgctttt tagtaaggtc tgtggaaaaa 1140 tattgccttc accattgctg gcagagaccagtgaggccat ggttccattt gtttctgata 1200 cagaagacag tggcagtggc tttgagcttaccgttactgc tgtacagaag tcagaagcag 1260 ggtcaggttg tgggagtctg gctatattggtagaagaagg gacaaatcac tctgccaagt 1320 atcctgattt gtatcccagt aacacaaggtgtcattggtt catttgtgct ccagagaagc 1380 acattataaa gttgacattt gaggactttgctgtcaaatt tagtccaaac tgtatttatg 1440 atgctgttgt gatttacggt gattctgaagaaaagcacaa gttagctaaa ctttgtggaa 1500 tgttgaccat cacttcaata ttcagttctagtaacatgac ggtgatatac tttaaaagtg 1560 atggtaaaaa tcgtttacaa ggcttcaaggccagatttac cattttgccc tcagagtctt 1620 taaacaaatt tgaaccaaag ttacctccccaaaacaatcc tgtatctacc gtaaaagcta 1680 ttctgcatga tgtctgtggc atccctccatttagtcccca gtggctttcc agaagaatcg 1740 caggagggga agaagcctgc ccccactgttggccatggca ggtgggtctg aggtttctag 1800 gcgattacca atgtggaggt gccatcatcaacccagtgtg gattctgacc gcagcccact 1860 gtgtgcaatt gaagaataat ccactctcctggactattat tgctggggac catgacagaa 1920 acctgaagga atcaacagag caggtgagaagggccaaaca cataatagtg catgaagact 1980 ttaacacact aagttatgac tctgacattgccctaataca actaagctct cctctggagt 2040 acaactcggt ggtgaggcca 
gtatgtctcccacacagcgc agagcctcta ttttcctcgg 2100 agatctgtgc tgtgaccgga tggggaagcatcagtgcaga gctctctctg aatgtttctt 2160 cattagatgg tggcctagca agtcgcctacagcagattca agtgcatgtg ttagaaagag 2220 aggtctgtga acacacttac tattctgcccatccaggagg gatcacagag aagatgatct 2280 gtgctggctt tgcagcatct ggagagaaagatttctgcca gggagactct ggtgggccac 2340 tagtatgtag acatgaaaat ggtccctttgtcctctatgg cattgtcagc tggggagctg 2400 gctgtgtcca gccatggaag ccgggtgtatttgccagagt gatgatcttc ttggactgga 2460 tccaatcaaa aatcaatggt aaattgttttcaaatgttat taaaacaata acctctttct 2520 ttagagtggg tttgggaaca gtgagttgttgctctgaagc agagctagaa aagcctagag 2580 gcttttttcc cacaccacgg tatctactggattatagagg aagactggaa tgttcttggg 2640 tgctcagagt ttcagcaagc agtatggcaaaatttaccat tgagtatctg tcactcctgg 2700 ggtctcctgt gtgtcaagac tcagttctaattatttatga agaaagacac agtaagagaa 2760 agacggcagg taatccttcc tggcatttgccaatggagat tagtagcccc tttaaatcac 2820 atcattcagc ttaatattat taacttcccgatgaagccaa caacttttgt ctgtcatggt 2880 catctgcgtg tttacgaagg atttggaccaggaaaaaaat taataggttt ctcaaggatg 2940 tccagtattg gatttgattc cagtgacttctgttgagatc acatctcttg attatcctaa 3000 cagttaactc aacatgctga attacacttggactttttat tccacaacag taaataaaat 3060 gaaggccatg attaaagact ttataacagaagaatctttg aactgtgatt gggattatat 3120 aatatttatg atagaccata tcagcaatcaattctacttt ttatgtcatt gcacagaaaa 3180 agtactttaa ttatcgaata tgtctttcattgcagctcat ttacatggct ccaagaaaga 3240 gtttatcttg atatccagtg ctgcttacctgactgtgcat tttaagactg atgagtctaa 3300 ggttggttgg tagctgtgct gcatctgcaatgtcatggcc ctggctagtt agtctgcagc 3360 acgggagatt ctggaagacc accgcaatgtgcccggcatg ggaagtacaa gcttgttgac 3420 attgtgagct gaggcagcag tcactgccatcccactgcac caccttcaca agaatttctg 3480 cctgcaggga ttggatcaca tctgccactagaggagaagt ga 3522 24 3277 DNA Homo sapiens misc_feature Incyte ID No7478108CB1 24 ccggtccctg ccatggggcc cccttccagc tcaggcttct atgtgagccgcgcagtggcc 60 ctgctgctgg ctgggttggt agccgccctc ctgctggcgc tggccgtactcgccgccttg 120 tacggccact gcgagcgcgt cccaccgtcg gagctgcctg gactcagggactcggaagcc 180 gagtcttccc 
ctcccctcag gcagaagccg acgccgaccc cgaaacccagcagtgcacgc 240 gagctagcgg tgacgaccac cccgagcaac tggcgacccc cggggccctgggaccagcta 300 cgcctgccgc cctggctcgt gccgctgcac tacgatctgg agctgtggccgcagctgagg 360 cccgacgagc ttccggccgg gtctttgccc ttcactggcc gcgtgaacatcacggtgcgc 420 tgcacggtgg ccacctctcg actgctgctg catagcctct tccaggactgcgagcgcgcc 480 gaggtgcggg gacccctttc cccgggcact gggaacgcca cagtgggccgcgtgcccgtg 540 gacgacgtgt ggttcgcgct ggacacggaa tacatggtgc tggagctcagtgagcccctg 600 aaacctggta gcagctacga gctgcagctt agcttctcgg gcctggtgaaggaagacctc 660 agggagggac tcttcctcaa cgtctacacc gaccagggcg agcgcagggccctgttagcg 720 tcccagctgg aaccaacatt tgccaggtat gttttccctt gttttgatgagccagctctg 780 aaggcaactt ttaatattac aatgattcat catccaagtt atgtggccctttccaacatg 840 ccaaagctag gtcagtctga aaaagaagat gtgaatggaa gcaaatggactgttacaacc 900 ttttccacta cgccccacat gccaacttac ttagtcgcat ttgttatatgtgactatgac 960 cacgtcaaca gaacagaaag gggcaaggag atacgcatct gggcccggaaagatgcaatt 1020 gcaaatggaa gtgcagactt tgctttgaac atcacaggtc ccatcttctcttttctggag 1080 gatttgttta atatcagtta ctctcttcca aaaacagata taattgccttgcctagtttt 1140 gacaaccatg caatggaaaa ctggggacta atgatatttg atgaatcaggattgttgttg 1200 gaaccaaaag atcaactgac agaaaaaaag actctgatct cctatgttgtctcccacgag 1260 attggacacc agtggtttgg aaacttggtt accatgaatt ggtggaacaatatctggctc 1320 aacgagggtt ttgcatctta ttttgagttt gaagtaatta actactttaatcctaaactc 1380 ccaagaaatg agatcttttt ttctaacatt ttacataata tcctcagagaagatcacgcc 1440 ctggtgacta gagctgtggc catgaaggtg gaaaatttca aaacaagtgaaatacaggaa 1500 ctctttgaca tatttactta cagcaaggga gcgtctatgg cccggatgctttcttgtttc 1560 ttgaatgagc atttatttgt cagtgcactc aagtcatatt tgaagacattttcctactca 1620 aacgctgagc aagatgatct atggaggcat tttcaaatgg ccatagatgaccagagtaca 1680 gttattttgc cagcaacaat aaaaaacata atggacagtt ggacacaccagagtggtttt 1740 ccagtgatca ctttaaacgt gtctactggc gtcatgaaac aggagccattttatcttgaa 1800 aacattaaaa atcggactct tctaaccagc aatgacacat ggattgtccctattctttgg 1860 ataaaaaatg gaactacaca acctttagtc tggctagatc 
aaagcagcaaagtattccca 1920 gaaatgcaag tttcagattc tgaccatgac tgggtgattt tgaatttgaatatgactgga 1980 tattatagag ttaattatga taaattaggt tggaagaaac taaatcaacaacttgaaaag 2040 gatcctaagg ctattcctgt tattcacaga ctgcagttca ttgatgatgccttttccttg 2100 tctaaaaaca attatattga gattgaaaca gcacttgagt taaccaagtaccttgctgaa 2160 gaagatgaaa ttatagtatg gcatacagtc ttggtaaact tggtaaccagggatcttgtt 2220 tctgaggtga acatctatga tatatactca ttattaaaga ggtacctattaaagagactt 2280 aatttaatat ggaatattta ttcaactata attcgtgaaa atgtgttggcattacaagat 2340 gactacttag ctctaatatc actggaaaaa ctttttgtaa ctgcgtgttggttgggcctt 2400 gaagactgcc ttcagctgtc aaaagaactt ttcgcaaaat gggtggatcatccagaaaat 2460 gaaatacctt atccaattaa agatgtggtt ttatgttatg gcattgccttgggaagtgat 2520 aaagagtggg acatcttgtt aaatacttac actaatacaa caaacaaagaagaaaagatt 2580 caacttgctt atgcaatgag ctgcagcaaa gacccatgga tacttaacagatatatggag 2640 tatgccatca gcacatctcc attcacttct aatgaaacaa atataattgaggttgtggct 2700 tcatctgaag ttggccggta tgtcgcaaaa gacttcttag tcaacaactggcaagctgtg 2760 agtaaaaggt atggaacaca atcattgatt aatctaatat atacaatagggagaaccgta 2820 actacagatt tacagattgt ggagctgcag cagtttttca gtaacatgttggaggaacac 2880 cagaggatca gagttcatgc caacttacag acaataaaga atgaaaatctgaaaaacaag 2940 aagctaagtg ccaggatagc tgcgtggcta aggagaaaca catagcttgtggctatcttt 3000 cagcactcct cttgcatatt ataatgtagt ttgttcacag ttttgtcttccaatactttg 3060 tgagtctgga aaaccacaca ttttatttgt atttcagtca catttattactcagagtgcc 3120 attcttctca tattgtcatg tttggccctg agggtgggtg attgctgacaattttgccaa 3180 tgctgctgta tttctgggaa agatgtcact tcatgttggg ttataatcccacagaattta 3240 ctttaaatgt cacgtaaaaa caaattcaaa aaaaaaa 3277 25 1254DNA Homo sapiens misc_feature Incyte ID No 7482021CB1 25 atgcgcacctcgtacaccgt gaccctgccc gaggaccccc ccgccgcccc ctttcccgcc 60 ctcgccaaggagctgcggcc gcgctcccct ctctccccgt ccctgctgct ctccaccttc 120 gtggggctcctgctcaacaa agccaagaat tctaagagtg cccagggtct ggctggtctt 180 cgaaaccttgggaacacgtg cttcatgaac tcaattctgc agtgcctgag caacactcgg 240 gagttgagagattactgcct ccagaggctc tacatgcggg 
acctgcacca cggcagcaat 300 gcacacacagccctcgtgga agagtttgca aaactaattc agaccatatg gacttcatcc 360 cccaatgatgtggtgagccc atctgagttc aagacccaga tccagagata tgcaccgcgc 420 tttgttggctataatcagca ggatgctcag gagttccttc gctttcttct ggatgggctc 480 cataacgaggtgaaccgagt gacactgaga cctaagtcca accctgagaa cctcgatcat 540 cttcctgatgacgagaaagg ccgacagatg tggagaaaat atctagaacg ggaagacagt 600 aggatcggggatctctttgt tgggcagcta aagagctcgc tgacgtgtac agattgtggt 660 tactgttctacggtcttcga ccccttctgg gacctctcac tgcccattgc taagcgaggt 720 tatcctgaggtgacattaat ggactgcatg aggctcttca ccaaagagga tgtgcttgat 780 ggagatgaaaagccaacatg ctgtcgctgc cgaggcagaa aacggtgtat aaagaagttc 840 tccatccagaggttcccaaa gatcttggtg ctccatctga agcggttctc agaatccagg 900 atccgaaccagcaagctcac aacatttgtg aacttccccc taagagacct ggacttaaga 960 gaatttgcctcagaaaacac caaccatgct gtttacaacc tgtacgctgt gtccaatcac 1020 tccggaaccaccatgggtgg ccactataca gcctactgtc gcagtccagg gacaggagaa 1080 tggcacactttcaacgactc cagcgtcact cccatgtcct ccagccaagt gcgcaccagc 1140 gacgcctacctgctcttcta cgaactggcc agcccgccct cccgaatgta gcgccaggag 1200 ccacgtcccttctcccttcc ccgtggtggc cccgctccct aaatttttta aaaa 1254 26 1120 DNA Homosapiens misc_feature Incyte ID No 7482145CB1 26 cgcgtgtgga agcgcttccgggcggtagca cgctgtgttg gcggcggctc cccgcttgcc 60 tcagctgcag cagcgggaagctcggtggca agcccttgta gtcctgtgcg atggcgtctc 120 gatatgacag ggcgatcactgtcttctccc cagacggaca cctttttcaa gttgaatatg 180 cccaggaagc ggtgaagaaaggatccaccg cggtcggaat tcgaggtacc aatatagttg 240 ttcttggggt agaaaaaaaatctgttgcca agcttcaaga tgaaagaact gtgaggaaaa 300 tttgtgccct tgatgaccatgtctgcatgg cttttgcagg acttactgct gatgctagag 360 tagtaataaa cagagcccgtgtggagtgcc agagccataa gcttacggtt gaggacccag 420 tcactgtaga atacataactcgcttcatag caactttaaa gcagaaatat acccaaagca 480 atggacgaag accttttggtatttctgcct taattgtagg ttttgatgat gatggtatct 540 caagattgta tcagacagatccttctggta cttatcatgc ttggaaggca aatgcaatag 600 gccgaagtgc taaaactgttcgagaatttc tagaaaagaa ttacacagaa gatgccatag 660 caagtgacag tgaagctatcaagttagcaa taaaagcttt 
gctagaagtt gtccagtctg 720 gtggaaaaaa cattgaacttgctataataa gaagaaatca acctttgaag atgtttagtg 780 caaaagaagt tgaattatatgtaactgaaa tagaaaagga aaaggaagaa gcagagaaga 840 aaaaatcaaa gaaatctgtctaattcttag gatgaccact gggaggtctt aatgttttgt 900 tttattgtac tgcctgaggttgtttagtga aattttagag gaaaacagtt attttgcagc 960 attacatgca gtacttgtgtgatgttttga gaatgccaga tctgtggctg tcttcattct 1020 attacatagt caaacataggtttatgtgaa gattttcttt gaaaggggat ttcagtaatt 1080 gttgagagca gtcataattccacataagcc tgagactcta 1120 27 4577 DNA Homo sapiens misc_feature IncyteID No 55022586CB1 27 agtgatcact atagggcctg gttatctaat gctgctcgagcgcgcgcagt gtgctggaaa 60 gcgcggctgg gcgcctcggc catgactgcg gagctgcagcaggacgacgc ggccggcgcg 120 gcagacggcc acggctcgag ctgccaaatg ctgttaaatcaactgagaga aatcacaggc 180 attcaggacc cttcctttct ccatgaagct ctgaaggccagtaatggtga cattactcag 240 gcagtcagcc ttctcactga tgagagagtt aaggagcccagtcaagacac tgttgctaca 300 gaaccatctg aagtagaggg gagtgctgcc aacaaggaagtattagcaaa agttatagac 360 cttactcatg ataacaaaga tgatcttcag gctgccattgctttgagtct actggagtct 420 cccaaaattc aagctgatgg aagagatctt aacaggatgcatgaagcaac ctctgcagaa 480 actaaacgct caaagagaaa acgctgtgaa gtctggggagaaaaccccaa tcccaatgac 540 tggaggagag ttgatggttg gccagttggg ctgaaaaatgttggcaatac atgttggttt 600 agtgctgtta ttcagtctct ctttcaattg cctgaatttcgaagacttgt tctcagttat 660 agtctgccac aaaatgtact tgaaaattgt cgaagtcatacagaaaagag aaatatcatg 720 tttatgcaag agcttcagta tttgtttgct ctaatgatgggatcaaatag aaaatttgta 780 gacccgtctg cagccctgga tctattaaag ggagcattccgatcatctga ggaacagcag 840 caagatgtga gtgaattcac acacaagctc ctggattggctagaggacgc attccagcta 900 gctgttaatg ttaacagtcc caggaacaaa tctgaaaatccaatggtgca gctgttctat 960 ggtactttcc tgactgaagg ggttcgtgaa ggaaaacccttttgtaacaa tgagaccttc 1020 ggccagtatc ctcttcaggt aaacggttat cgcaacttagacgagtgttt ggaaggggcc 1080 atggtggagg gtgatgttga gcttcttccc tccgatcactcggtgaagta tggacaagag 1140 cgttggttta caaagctacc tccagtgttg acctttgaactctcaagatt tgagtttaat 1200 cagtcccttg ggcagccaga gaaaattcac aataagctggaatttcctca 
gattatttat 1260 atggacaggt acatgtacag gagcaaggag cttattcgaaataagagaga gtgtattcga 1320 aagttgaagg aggaaataaa aattctgcag caaaaattggaaaggtatgt gaaatatggc 1380 tcaggcccag ctcggttccc gctcccggac atgctgaaatatgttattga atttgctagt 1440 acaaaacctg cctcagaaag ctgtccacct gaaagtgacacacatatgac attaccactt 1500 tcttcagtgc actgctcggt ttctgaccag acatccaaggaaagtacaag tacagaaagc 1560 tcttctcagg atgttgaaag taccttttct tctcctgaagattctttacc caagtctaaa 1620 ccactgacat cttctcggtc ttccatggaa atgccttcacagccagctcc acgaacagtc 1680 acagatgagg agataaattt tgttaagacc tgtcttcagagatggaggag tgagattgaa 1740 caagatatac aagatttaaa gacttgtatt gcaagtactactcagactat tgaacagatg 1800 tactgcgatc ctctccttcg tcaggtgcct tatcgcttgcatgcagttct tgttcatgaa 1860 ggacaagcaa atgctggaca ctattgggcc tatatctataatcaaccccg acagagctgg 1920 ctcaagtaca atgacatctc tgttactgaa tcttcctgggaagaagttga aagagattcc 1980 tatggaggcc tgagaaatgt tagtgcttac tgtctgatgtacattaatga caaactaccc 2040 tacttcaatg cagaggcagc cccaactgaa tcagatcaaatgtcagaagt ggaagcccta 2100 tctgtggaac tcaagcatta cattcaggag gataactggcggtttgagca ggaagtagag 2160 gagtgggaag aagagcagtc ttgcaaaatc cctcaaatggagtcctccac caactcctca 2220 tcacaggact actctacatc acaagagcct tcagtagcctcttctcatgg ggttcgctgc 2280 ttgtcatctg agcatgctgt gattgtaaag gagcaaactgcccaggctat tgcaaacaca 2340 gcccgtgcct atgagaagag cggtgtagaa gcggcactgagtgaggcatt ccatgaagaa 2400 tactccaggc tctatcagct tgccaaagag acccccacctctcacagtga tcctcgactt 2460 cagcatgtcc ttgtctactt tttccaaaat gaagcacccaaaagggtagt agaacgaacc 2520 cttctggaac agtttgcaga taaaaatctt agctatgatgaaagatcaat cagcattatg 2580 aaggtggctc aagcgaaact gaaggaaatt ggtccagatgacatgaatat ggaagagtac 2640 aagaagtggc atgaagatta tagtttgttc cgaaaagtgtctgtgtatct cctaacaggc 2700 ctagaactct atcaaaaagg aaagtaccaa gaggcactttcctacctggt atatgcctac 2760 cagagcaatg ctgccctgct gatgaagggg ccccgccggggggtcaaaga atccgtgatt 2820 gctttatacc gaagaaaatg ccttctggag ctgaatgccaaagcagcttc tctttttgaa 2880 acaaatgatg atcactccgt aactgagggc attaatgtgatgaatgaact gatcatcccc 2940 tgcattcacc ttatcattaa 
taatgacatt tccaaggatgatctggatgc cattgaggtc 3000 atgagaaacc attggtgctc ttaccttggg caagatattgcagaaaatct gcagctgtgc 3060 ctaggggagt ttctacccag acttctagat ccttctgcagaaatcatcgt cttgaaagag 3120 cctccaacta ttcgacccaa ttctccctat gacctatgtagccgatttgc agctgtcatg 3180 gagtcaattc agggagtttc aactgtgaca gtgaaataagctcccacatg ttcaaggccc 3240 attctggttc ctggctgcct gcctcttgca cagaagttcgttgtcatagt gctcaccttg 3300 ggaaaaggat taggtgggca cataagattc cgatcagaccccaaccatgc tgcatgtgta 3360 aagaaggatt gaaaataaaa ttgcactttt taggtacaaaatcataaaag ctgtttcact 3420 agaaaaggca gaaagcagtg tattaaggtg ttgaattacgccagaagacc tgaaatgcct 3480 tgtacctaca acaatgctta ggcttttcta agcctcttgccacttttaaa attatccttc 3540 aggcataaat atttttgaca gcagaataga agaatgattcatgagaacct gaaccagatg 3600 aacagctact agttatttta tcaaatacag atgacatttaaaaattctta actacaagag 3660 attagaaata taaaccttgc ctggctcttg ccaggagataacaaaatggg ttgctgatga 3720 actgcaccct tttacatgtg ggtagaatat aagctcacatggcagtgaga tgttgaaaag 3780 tcaaaagaga cctgtctctc tcctttcttt tctatctttaaaccagaaaa cctcatactc 3840 agtcctcagt gaaagaaagt aaagtattaa ggactttagacagaagagca ttgtgtaact 3900 tgactgaaga tcatccatta atagttatta ggcatttaggtaaaattttc taatacctaa 3960 aaattgtcaa aaacagtcaa tagggctact gctggcccaaagaccattta ggtccacctc 4020 ctcttttttg ctcttttttt ttttctgtga cagtttcactgtgtcgccca ggctggcgtt 4080 cagtggtgca atctcagctc actgcaaact ctgtctcctgggctcaagtg attctcgtgc 4140 ctcagcctcc cgaatagctg gaattacggg catgcaccaccacacctggc taatttttgt 4200 atttttaata gagatggggt ttcaccatat tggccaggctgatctctaac tcctggcctc 4260 aagtgatcta tctgcctccc tcagcctccc aaagtctgggattgcagaca agtcatcgta 4320 cccggccttc ttttttgccc ttaaaagtaa gggatgtgggtttgtacaaa aaaaaaacaa 4380 aaaaaaaaag aggggcggcc gcgcgattat tgagtctcttgcaacccgcg aatttatttc 4440 cgaaccggtt acctgagggc gttcccagtt tcctaatggtgagtcgtttt acagcttgta 4500 gtaatcatga acaagctgtc ctgtgtgaat tgtttcgttccatccacata tcacacacac 4560 aacacggacg gaagacg 4577 28 1952 DNA Homosapiens misc_feature Incyte ID No 3238072CB1 28 aagtgctccc acgtggcctgcggccgctat attgaggacc 
acgccctgaa acactttgag 60 gagacgggac acccgctagccatggaagtc cgggatctct acgtgttctg ttacctgtgc 120 aaggactacg tgctcaatgataacccagag ggggacctga agctgctaag aagctccctc 180 ctggcggtcc ggggccagaaacaggacacg ccggtgagac gtgggcggac gctgcggtcc 240 atggcttcgg gtgaggacgtggtcctgccg cagcgcgctc ctcagggaca gccgcagatg 300 ctcacggctc tgtggtaccggcgtcagcgc ctgctggcca ggacgctgcg gctgtggttc 360 gagaagagct cccggggccaggcgaagctg gagcagcggc ggcaggagga ggccctggag 420 cgcaagaagg aggaggcgcggaggcggcgg cgcgaggtga aacggcggct gctggaggag 480 ctggccagca cccctccgcgcaagagtgca cggctgctcc tgcacacgcc ccgcgacgcg 540 ggcccggctg cctcgcgccccgccgccctc cctacctcac gcagagtgcc cgccgccaca 600 ctcaagctgc gtcgccagccggccatggcc ccaggcgtca cgggcctgcg caacctgggc 660 aacacctgct acatgaactccatcctccag gtgctcagcc acctccagaa gttccgagaa 720 tgtttcctca accttgacccttccaaaacg gaacatctgt ttcccaaagc caccaacggg 780 aagactcagc tttctggcaagccaaccaac agctcggcca cggagctgtc cttgagaaat 840 gacagggccg aggcatgcgagcgggagggc ttctgctgga acggcagggc ctccattagt 900 cggagtctgg agctcatccagaacaaggag ccgagttcaa agcacatttc cctctgccgt 960 gaactgcaca ccctcttccgagtcatgtgg tccgggaagt gggccctagt gtcgcccttc 1020 gccatgctgc actcagtgtggagcctgatc cctgccttcc gcggctacga ccaacaggac 1080 gcgcaggaat ttctctgcgagctgctgcac aaggtgcagc aggaactcga gtctgagggc 1140 accacacgcc ggatcctcatccccttctcc cagaggaagc tcaccaaaca ggtcttaaag 1200 gtggtgaata ccatatttcatgggcagctg ctcagtcagg tcacatgtat atcatgcaat 1260 tacaaatcca ataccattgagcccttttgg gacctatccc tggaattccc tgaacgctat 1320 cactgcatag aaaaggggtttgtccctttg aatcaaacag agtgcttgct cactgagatg 1380 ctggccaaat tcacagagacagaggccctg gaagggagaa tctacgcttg tgaccagtgt 1440 aacagcaaac gacgaaaatccaatcccaaa ccccttgttc tgagtgaagc tagaaagcag 1500 ttaatgatct acagactacctcaggttctc cggctgcacc ttaaaagatt caggtggtct 1560 ggccgtaatc atcgagagaagattggggtc catgtcgtct ttgaccaggt attaaccatg 1620 gaaccttact gctgcagggacatgctctcc tctcttgaca aagagacctt tgcctatgat 1680 ctctccgcag tggtcatgcatcacgggaaa gggtttggct caggacacta cacagcctat 1740 tgctacaaca 
cagagggaggtttttgggtc cactgcaatg actcaaagct gaatgtatgc 1800 agtgtcgagg aagtgtgcaaaacccaggcc tacatccttt tttacactca aagaacagtg 1860 cagggcaatg caagaatctcagaaacccat ctccaagctc aggtgcagtc cagcaacaat 1920 gatgaaggca gaccacagacattttcctga at 1952 29 1092 DNA Homo sapiens misc_feature Incyte ID No7482034CB1 29 aagggggctg ctccatcagc caatcccaaa gcctgaattg ggggttgaggagaaatgaag 60 cgtcagctca cacaccttcc tggccggttc tggctgtggc ccagcttctctgtagcgtcc 120 ctcctatccc accagacccc agccacaaat tcctggcttg cttcttccaaacttcattca 180 gccccaggga tggctctgca ggatgtgtgc aagtggcagt cccctgacacccagggacca 240 tcacctcacc tgcctcgggc tggcggctgg gctgtgcccc ggggttgtgaccctcaaacc 300 ttcctgcaga tccatggccc cagactggcc cacggcacca ccactctggccttccgcttc 360 cgtcatggag tcattgctgc agctgacacg cgttcctcct gtggcagctatgtggcgtgt 420 ccagcctcat gcaaggtcat ccctgtgcac cagcacctcc tgggtaccacctctggcacc 480 tctgccgact gtgctacctg gtatcgggta ttacagcggg agctgcggcttcgggaactg 540 agggagggtc agctgcccag tgtggccagt gctgccaagc tcttgtcagccatgatgtct 600 caataccggg gactggatct ctgtgtggcc actgccctct gcggctgggaccgctctggc 660 cctgagctct tctacgtcta tagcgacggc acccgcctgc agggggacatcttctctgtg 720 ggctctggat ctccctatgc ctacggcgtg ctagaccgtg gctatcgctacgacatgagc 780 acccaggaag cctacgccct ggctcgctgc gccgtggccc acgccacccaccgtgatgcc 840 tattcagggg gctctgtaga ccttttccac gtgcgggaga gtggatgggagcatgtgtca 900 cgcagtgatg cctgtgtgct gtacgtggag ttacagaagc tcctggagccggagccagag 960 gaggatgcca gccatgccca tcctgagcct gccactgccc acagagctgcagaagataga 1020 gagctctctg tggggccagg ggaggtgaca ccaggagact ccaggatgccagcagggact 1080 gagacggtgt ga 1092 30 2847 DNA Homo sapiens misc_featureIncyte ID No 7474351CB1 30 atggtcagca aggggggagt tgctgcagag ccagagccacactattgtga ggacagtgaa 60 agaggcccca acaccctcac aggtccgggc agccttcctagaggaggtgg cattgaggtg 120 ggcatggagt ttccgggatg cagcggtgaa gggtgcgtgaagccccatga ggaggcggcc 180 cgggaggggg cgggcagagg caagagggct gtgccgggacccaagcgacg gcagcagggg 240 tcagcagagg ggcctgcggc ggggtggacg ctggagcaggagaccagggg agatgtctta 300 gaggataaaa atgagcgggc 
agatgaagag atactcaggctggcaccagg gaaaggcagg 360 ctcccaatag acagcaaaca cctgaaaccg gtgatcagcagcttcccggt aagatctcag 420 gagctgggcg agggggctgg agcaggcaca ctaagaggcaaaatggcaga gtttaactgg 480 tctatggcct tcaagggacc tgcggctggt catgaagagcgcctcaactc tgtgtccagc 540 agggccaaga agggcattgg ctgggatgtc gctgctgcttctcttcgtgg tgttgaccat 600 ttctcagacc tccccccgcc cctgcaggtc agggaggagttggaggcttg cgcgtttaga 660 gtgcaggtgg ggcagctgag gctctatgag gacgaccagcggacgaaggt ggttgagatc 720 gtccgtcacc cccagtacaa cgagagcctg tctgcccagggcggtgcgga catcgccctg 780 ctgaagctgg aggccccggt gccgctgtct gagctcatccacccggtctc gctcccgtct 840 gcctccctgg acgtgccctc ggggaagacc tgctgggtgaccggctgggg tgtcattgga 900 cgtggagaac tactgccctg gcccctcagc ttgtgggaggcgacggtgaa ggtcaggagc 960 aacgtcctct gtaaccagac ctgtcgccgc cgctttccttccaaccacac tgagcggttt 1020 gagcggctca tcaaggacga catgctgtgt gccggggacgggaaccacgg ctcctggcca 1080 ggcgacaacg ggggccccct cctgtgcagg cggaattgcacctgggtcca ggtggaggtg 1140 gtgagctggg gcaaactctg cggccttcgc ggctatcccggcatgtacac ccgcgtgacg 1200 agctacgtgt cctggatccg ccagccatgc ccctcagctcagacccctgc tgtggtccga 1260 agatttgtgc tccccccaaa tccagatgtt gaagccctaactcccagtgt gatgggatca 1320 ggagcgccgc tgcccccggc ccccgacctg caagaggccgaggtccccat catgaggacc 1380 cgagcttgcg agaggatgta tcacaaaggc cccactgcccacggccaggt caccatcatc 1440 aaggctgcca tgccgtgtgc agggaggaag gggcagggttcctgccaggc cgctctgagg 1500 acggaggacc tcaccccaac cacacccaac acggaggtgtctccacgtgc agaccccagg 1560 ctgagccagc cggaggacat ctggccagag tgggcttggccagttgtggt gggcaccacc 1620 atgctgctgc tgctgctgtt cctggctgtc tcctccctggggagctgtag cactgggagt 1680 ccagctcccg tccccgagaa tgacctggtg ggcattgtggggggccacaa caccccaggg 1740 gaagtggtcg tggcagtggg tgctgaccgc cgctcactgcattttccgga aggacaccga 1800 cccgtccacc taccggattc acaccaggga tgtgtatctgtacgggggcc gggggctgct 1860 gaatgtcagc cagatcgtcg tccacccaac tactctgtcttcttcctggg ggcagacatc 1920 gccctgctga agctggccac cagttccctg gagttcactgacagtgacaa ctgctggaac 1980 acaggctggg gcatggtcgg cttgttggat atgctgccgcctccttaccg cccgcagcag 2040 
gtgaaggtcc tcacactgag caatgcagac tgtgagcggcagacctacga tgcttttcct 2100 ggtgctggag acagaaagtt catccaggat gacatgatctgtgccggccg cacgggccgc 2160 cgcacctgga agggtgactc aggcggcccc ctggtctgcaagaagaaggg tacctggctc 2220 caggcgggag tagtgagctg gggattttac agtgatcggcccagcattgg cgtctacaca 2280 cgcccagaga ccagctggca gggtgccaac catgcagacgcccagagacc agctggcagg 2340 gtgccaacca tgcagaggcc cagagacatg ggccagggccaggagtgggt ctgcaggccc 2400 ttcacccacg tcacctgcta cccgacggcc atccccaggcccttcaccca tgtcacctgc 2460 tacctgatgg ctgtccccag caccctcacc cacgtcacctgctacccgac ggccgtcccc 2520 aggcccttca cccatgtcac ctgctacctg atggctgtccccagcaccct cacccacatc 2580 acctgctaca tgatggccgt ccccaggccc tttacccacatcacctgcta cccaatggct 2640 gtccccagca cccttaccca cgtcacctgc cacccgacggccatccccag gcccttcacc 2700 cacatcacct gctacacgat ggccatcccc aggccttcaaccacgccacc tgctacacga 2760 cggccatccc cagcaccctc acccacgtca cctgctacacgatggccgtc cccaggccca 2820 tcacccatgt cacctgctac acgatag 2847 31 1396DNA Homo sapiens misc_feature Incyte ID No 2232483CB1 31 gcccacgtgacgggcgcccg cggaaggcga catgggctcc gctccctggg ccccggtcct 60 gctgctggcgctcgggctgc gcggcctcca ggcgggggcc cgcagggccc cggaccccgg 120 cttccaggagcgcttcttcc agcagcgtct ggaccacttc aacttcgagc gcttcggcaa 180 caggaccttccctcagcgct tcctggtgtc ggacaggttc tgggtccggg gcgaggggcc 240 catcttcttctacactggga acgagggcga cgtgtgggcc ttcgccaaca actcggcctt 300 cgtcgcggagctggcggccg agcggggggc tctactggtc ttcgcggagc accgctacta 360 cgggaagtcgctgccgttcg gtgcgcagtc cacgcagcgc gggcacacgg agctgctgac 420 ggtggagcaggccctggccg acttcgcaga gctgctccgc gcgctacgac gcgacctcgg 480 ggcccaggatgcccccgcca tcgccttcgg tggaagttat ggggggatgc tcagtgccta 540 cctgaggatgaagtatcccc acctggtggc gggggcgctg gcggccagcg cgcccgttct 600 agctgtggcaggcctcggcg actccaacca gttcttccgg gacgtcacgg cgggagccta 660 cgacacggtccgctgggagt tcggcacctg ccagccgctg tcagacgaga aggacctgac 720 ccagctcttcatgttcgccc ggaatgcctt caccgtgctg gccatgatgg actaccccta 780 ccccactgacttcctgggtc ccctccctgc caaccccgtc aaggtgggct gtgatcggct 840 gctgagtgaggcccagagga 
tcacggggct gcgagcactg gcagggctgg tctacaacgc 900 ctcgggctccgagcactgct acgacatcta ccggctctac cacagctgtg ctgaccccac 960 tggctgcggcaccggccccg acgccagggc ctgggactac caggcctgca ccgagatcaa 1020 cctgaccttcgccagcaaca atgtgaccga tatgttcccc gacctgccct tcactgacga 1080 gctccggccaagcgatctca gagccgccag caacatcatc ttctccaacg ggaacctgga 1140 cccctgtggcaggggcggga ttcggaggaa cctgagtgcc tcagtcatcg ccgtcaccat 1200 ccaggggggagcgcaccacc tcgacctcag agcctcccac ccagaagatc ctgcttccgt 1260 ggttgaggcgcggaagctgg aggccaccat catcggcgag tgcgtaaagg cagccaggcg 1320 tgagcagcagccagctctgc gttggggggc ccagatcagc ctctgagcac aggactggag 1380 gggtctcagggctcta 1396 32 1853 DNA Homo sapiens misc_feature Incyte ID No7481712CB1 32 tcagagctgc ctgcgagtcc tgcctgctca gggctccttc agcctccactcatgcggtgc 60 ctcctggcct ccccgcctgg ccctcagctg ggcctctgct ccactctgccccttccaagt 120 gacacagcct tccagggatc cttctccgca gcttcctccg cccctccctgtgtttttcta 180 gggaccaggt tcttcgagtc ctggccaaag atgagaagca gctttcacttctcggggatc 240 tggagggcct gaaaccccag aaggtggact tctggcgtgg cccagccaggcccagcctcc 300 ctgtggatat gagagttcct ttctccgaac tgaaagacat caaagcttatctggagtctc 360 atggacttgc ttacagcatc atgataaagg acatccaggt gaagccctgccccagctggg 420 accctgcctt ccgccttcct ttctggttgg ggcccaacat ggaggagatgttctcggggc 480 taaaagtgga catgtggttt ctgggtctcc atcagcgtgt ttgtgaacatgctgtggaag 540 gaacaggctg cccaccccct cacttcacca aagcttccct cgacaatgtcacacgcaact 600 tccagatcca acccgatggc cgactctcaa tgttcctctt ccaacagcacaactggtcac 660 tctctccttc ctggagcctg tctcttcccc tggcatccag gacttctgtgttctgtctcc 720 agccagcacc tcctctcctg gatccaaccg cctactcagt gtttccacctgggggtgcaa 780 tgggcatctc caactttcca gccccaggaa tggagcaaac gctggtgcattttccaggcc 840 aaggcagatt cctgttcctg gaagtggggc cagctgtgct gctggatgaggaaagacagg 900 ccatggcgaa atcccgccgg ctggagcgca gcaccaacag cttcagttactcatcatacc 960 acaccctgga ggagatatat agctggattg acaactttgt aatggagcattccgatattg 1020 tctcaaaaat tcagattggc aacagctttg aaaaccagtc cattcttgtcctgaagttca 1080 gcactggagg ttctcggcac ccagccatct ggattgacac 
tggaattcactcccgggagt 1140 ggatcaccca tgccaccggc atctggactg ccaataagat tgtcagtgattatggcaaag 1200 accgtgtcct gacagacata ctgaatgcca tggacatctt catagagctcgtcacaaacc 1260 ctgatgggtt tgcttttacc cacagcatga accgcttatg gcggaagaacaagtccatca 1320 gacctggaat cttctgcatc ggcgtggatc tcaacaggaa ctggaagtcgggttttggag 1380 gaaatggttc taacagcaac ccctgctcag aaacttatca cgggccctcccctcagtcgg 1440 agccggaggt ggctgccata gtgaacttca tcacagccca tggcaacttcaaggctctga 1500 tctccatcca cagctactct cagatgctta tgtaccctta cggccgattgctggagcccg 1560 tttcaaatca gagggagttg tacgatcttg ccaaggatgc ggtggaggccttgtataagg 1620 tccatgggat cgagtacatt tttggcagca tcagcaccac cctctatgtggccagtggga 1680 tcaccgtcga ctgggcctat gacagtggca tcaagtacgc cttcagctttgagctccggg 1740 acactgggca gtatggcttc ctgctgccgg ccacacagat catccccacggcccaggaga 1800 cgtggatggc gcttcggacc atcatggagc acaccctgaa tcacccctactag 1853 33 3344 DNA Homo sapiens misc_feature Incyte ID No 8213480CB133 atgggctgga ggccccggag agctcggggg accccgttgc tgctgctgct actactgctg 60ctgctctggc cagtgccagg cgccggggtg cttcaaggac atatccctgg gcagccagtc 120accccgcact gggtcctgga tggacaaccc tggcgcaccg tcagcctgga ggagccggtc 180tcgaagccag acatggggct ggtggccctg gaggctgaag gccaggagct cctgcttgag 240ctggagaaga accacaggct gctggcccca ggatacatag aaacccacta cggcccagat 300gggcagccag tggtgctggc ccccaaccac acggatcatt gccactacca agggcgagta 360aggggcttcc ccgactcctg ggtagtcctc tgcacctgct ctgggatgag tggcctgatc 420accctcagca ggaatgccag ctattatctg cgtccctggc caccccgggg ctccaaggac 480ttctcaaccc acgagatctt tcggatggag cagctgctca cctggaaagg aacctgtggc 540cacagggatc ctgggaacaa agcgggcatg accagccttc ctggtggtcc ccagagcagg 600ggcaggcgag aagcgcgcag gacccggaag tacctggaac tgtacattgt ggcagaccac 660accctgttct tgactcggca ccgaaacttg aaccacacca aacagcgtct cctggaagtc 720gccaactacg tggaccagct tctcaggact ctggacattc aggtggcgct gaccggcctg 780gaggtgtgga ccgagcggga ccgcagccgc gtcacgcagg acgccaacgc cacgctctgg 840gccttcctgc agtggcgccg gggactgtgg gcgcagcggc cccacgactc cgcgcagctg 900ctcacgggcc gcgccttcca gggcgccaca 
gtgggcctgg cgcccgtcga gggcatgtgc 960cgcgccgaga gctcgggagg cgtgagcacg gaccactcgg agctccccat cggcgccgca 1020gccaccatgg cccatgagat cggccacagc ctcggcctca gccacgaccc cgacggctgc 1080tgcgtggagg ctgcggccga gtccggaggc tgcgtcatgg ctgcggccac cgggcacccg 1140tttccgcgcg tgttcagcgc ctgcagccgc cgccagctgc gcgccttctt ccgcaagggg 1200ggcggcgctt gcctctccaa tgccccggac cccggactcc cggtgccgcc ggcgctctgc 1260gggaacggct tcgtggaagc gggcgaggag tgtgactgcg gccctggcca ggagtgccgc 1320gacctctgct gctttgctca caactgctcg ctgcgcccgg gggcccagtg cgcccacggg 1380gactgctgcg tgcgctgcct gctgaagccg gctggagcgc tgtgccgcca ggccatgggt 1440gactgtgacc tccctgagtt ttgcacgggc acctcctccc actgtccccc agacgtttac 1500ctactggacg gctcaccctg tgccaggggc agtggctact gctgggatgg cgcatgtccc 1560acgctggagc agcagtgcca gcagctctgg gggcctggct cccacccagc tcccgaggcc 1620tgtttccagg tggtgaactc tgcgggagat gctcatggaa actgcggcca ggacagcgag 1680ggccacttcc tgccctgtgc agggagggat gccctgtgtg ggaagctgca gtgccagggt 1740ggaaagccca gcctgctcgc accgcacatg gtgccagtgg actctaccgt tcacctagat 1800ggccaggaag tgacttgtcg gggagccttg gcactcccca gtgcccagct ggacctgctt 1860ggcctgggcc tggtagagcc aggcacccag tgtggaccta gaatggtttg caatagcaac 1920cataactgcc actgtgctcc aggctgggct ccacccttct gtgacaagcc aggctttggt 1980ggcagcatgg acagtggccc tgtgcaggct gaaaaccatg acaccttcct gctggccatg 2040ctcctcagcg tcctgctgcc tctgctccca ggcgccggcc tggcctggtg ttgctaccga 2100ctcccaggag cccatctgca gcgatgcagc tggggctgca gaagggaccc tgcgtgcagt 2160ggccccaaag atggcccaca cagggaccac cccctgggcg gcgttcaccc cacggagttg 2220ggccccacag ccactggaca gtcctggccc ctggaccctg agaactctca tgagcccagc 2280agccaccctg agaagcctct gccagcagtc tcgcctgacc cccaagcaga tcaagtccag 2340atgccaagat cctgcctctg gtgagaggta gctcctaaaa tgaacagatt taaagacagg 2400tggccactga cagccactcc aggaacttga actgcagggg cagagccagt gaatcaccgg 2460acctccagca cctgcaggca gcttggaagt ttcttccccg agtggagctt cgacccaccc 2520actccaggaa cccagagcca cattagaagt tcctgagggc tggagaacac tgctgggcac 2580actctccagc tcaataaacc atcagtccca gaagcaaagg tcacacagcc cctgacctcc 
2640ctcaccagtg gaggctgggt agtgctggcc atcccaaaag ggctctgtcc tgggagtctg 2700gtgtgtctcc tacatgcaat ttccacggac ccagctctgt ggagggcatg actgctggcc 2760agaagctagt ggtcctgggg ccctatggtt cgactgagtc cacactcccc tgcagcctgg 2820ctggcctctg caaacaaaca taattttggg gaccttcctt cctgtttctt cccaccctgt 2880cttctcccct aggtggttcc tgagccccca cccccaatcc cagtgctaca cctgaggttc 2940tggagctcag aatctgacag cctctccccc attctgtgtg tgtcgggggg acagagggaa 3000ccatttaaga aaagatacca aagtagaagt caaaagaaag acatgttggc tataggcgtg 3060gtggctcatg cctataatcc cagcactttg ggaagccggg gtaggaggat caccagaggc 3120cagcaggtcc acaccagcct gggcaacaca gcaagacacc gcatctacag aaaaatttta 3180aaattagctg ggcgtggtgg tgtgtacctg taggcctagc tgctcaggag gctgaagcag 3240gaggatcact tgagcctgag ttcaacactg cagtgagcta tggtggcacc actgcactcc 3300agcctgggtg acagagcaag accatgtctc taaaataaat ttta 3344 34 3389 DNA Homosapiens misc_feature Incyte ID No 7478405CB1 34 cggccgcgga aagaatgcgcgccgcccgtg cgctccgcct gccgcgtctg gccacccgca 60 gccgccgcgt ccgcacctgaccatggagtg cgccctcctg ctcgcgtgtg ccttcccggc 120 tgcgggttcg ggcccgccgaggggcctggc gggactgggg cgcgtggcca aggcgctcca 180 gctgtgctgc ctctgctgtgcgtcggtcgc cgcggcctta gccagtgaca gcagcagcgg 240 cgccagcgga ttaaatgatgattacgtctt tgtcacgcca gtagaagtag actcagccgg 300 gtcatatatt tcacacgacattttgcacaa cggcaggaaa aagcgatcgg cgcagaatgc 360 cagaagctcc ctgcactaccgattttcagc atttggacag gaactgcact tagaacttaa 420 gccctcggcg attttgagcagtcactttat tgtccaggta cttggaaaag atggtgcttc 480 agagactcag aaacccgaggtgcagcaatg cttctatcag ggatttatca gaaatgacag 540 ctcctcctct gtcgctgtgtctacgtgtgc tggcttgtca ggtttaataa ggacacgaaa 600 aaatgaattc ctcatctcgccattacctca gcttctggcc caggaacaca actacagctc 660 ccctgcgggt caccatcctcacgtactgta caaaaggaca gcagaggaga agatccagcg 720 gtaccgtggc taccccggctctggccggaa ttatcctggt tactccccaa gtcacattcc 780 ccatgcatct cagagtcgagagacagagta tcaccatcga aggttgcaaa agcagcattt 840 ttgtggacga cgcaagaaatatgctcccaa gcctcccaca gaggacacct atctaaggtt 900 tgatgaatat gggagctctgggcgacccag aagatcagct ggaaaatcac aaaagggcct 960 
caatgtggaa accctcgtggtggcagacaa gaaaatggtg gaaaagcatg gcaagggaaa 1020 tgtcaccaca tacattctcacagtaatgaa catggtttct ggcctattta aagatgggac 1080 tattggaagt gacataaacgtggttgtggt gagcctaatt cttctggaac aagaacctgg 1140 aggattattg atcaaccatcatgcagacca gtctctgaat agtttttgtc aatggcagtc 1200 tgccctcatt ggaaagaatggcaagagaca tgatcatgcc atcttactaa caggatttga 1260 tatttgttct tggaagaatgaaccatgtga cactctaggg tttgccccca tcagtggaat 1320 gtgctctaag taccgaagttgtaccatcaa tgaggacaca ggacttggcc ttgccttcac 1380 catcgctcat gagtcagggcacaactttgg tatgattcac gacggagaag ggaatccctg 1440 cagaaaggct gaaggcaatatcatgtctcc cacactgacc ggaaacaatg gagtgttttc 1500 atggtcttcc tgcagccgccagtatctcaa gaaattcctc agcacacctc aggcggggtg 1560 tctagtggat gagcccaagcaagcaggaca gtataaatat ccggacaaac taccaggaca 1620 gatttatgat gctgacacacagtgtaaatg gcaatttgga gcaaaagcca agttatgcag 1680 ccttggtttt gtgaaggatatttgcaaatc actttggtgc caccgagtag gccacaggtg 1740 tgagaccaag tttatgcccgcagcagaagg gaccgtttgt ggcttgagta tgtggtgtcg 1800 gcaaggccag tgcgtaaagtttggggagct cgggccccgg cccatccacg gccagtggtc 1860 cgcctggtcg aagtggtcagaatgttcccg gacatgtggt ggaggagtca agttccagga 1920 gagacactgc aataaccccaagcctcagta tggtggcata ttctgtccag gttctagccg 1980 tatttatcag ctgtgcaatattaacccttg caatgaaaat agcttggatt ttcgggctca 2040 acagtgtgca gaatataacagcaaaccttt ccgtggatgg ttctaccagt ggaaacccta 2100 tacaaaagtg gaagaggaagatcgatgcaa actgtactgc aaggctgaga actttgaatt 2160 tttttttgca atgtccggcaaagtgaaaga tggaactccc tgctccccaa acaaaaatga 2220 tgtttgtatt gacggggtttgtgaactagt gggatgtgat catgaactag gctctaaagc 2280 agtttcagat gcttgtggcgtttgcaaagg tgataattca acttgcaagt tttataaagg 2340 cctgtacctc aaccagcataaagcaaatga atattatccg gtggtcatca ttccagctgg 2400 cgcccgaagc atcgaaatccaggagctgca ggtttcctcc agttacctcg cagttcgaag 2460 cctcagtcaa aagtattacctcaccggggg ctggagcatc gactggcctg gggagttccc 2520 cttcgctggg accacgtttgaataccagcg ctctttcaac cgcccggaac gtctgtacgc 2580 gccagggccc acaaatgagacgctggtctt tgaaattctg atgcaaggca aaaatccagg 2640 gatagcttgg aagtatgcacttcccaaggt 
catgaatgga actccaccag ccacaaaaag 2700 acctgcctat acctggagtatcgtgcagtc agagtgctcc gtctcctgtg gtggaggtta 2760 cataaatgta aaggccatttgcttgcgaga tcaaaatact caagtcaatt cctcattctg 2820 cagtgcaaaa accaagccagtaactgagcc caaaatctgc aacgctttct cctgcccggc 2880 ttactggatg ccaggtgaatggagtacatg cagcaaggcc tgtgctggag gccagcagag 2940 ccgaaagatc cagtgtgtgcaaaagaagcc cttccaaaag gaggaagcag tgttgcattc 3000 tctctgtcca gtgagcacacccactcaggt ccaagcctgc aacagccatg cctgccctcc 3060 acaatggagc cttggaccctggtctcagtg ttccaagacc tgtggacgag gggtgaggaa 3120 gcgtgaactc ctctgcaagggctctgccgc agaaaccctc cccgagagcc agtgtaccag 3180 tctccccaga cctgagctgcaggagggctg tgtgcttgga cgatgcccca agaacagccg 3240 gctacagtgg gtcgcttcttcgtggagcga ggtattgatt agaagtcact gctgggtcag 3300 gagattgaga ccatcctggctaacacagtg aaaccctgtc tctactaaaa atacaaaaaa 3360 ttagccaggc aaggtggcaggcgcctgta 3389

What is claimed is:
 1. An isolated polypeptide selected from the group consisting of: a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17.
 2. An isolated polypeptide of claim 1 selected from the group consisting of SEQ ID NO: 1-17.
 3. An isolated polynucleotide encoding a polypeptide of claim 1.
 4. An isolated polynucleotide encoding a polypeptide of claim 2.
 5. An isolated polynucleotide of claim 4 selected from the group consisting of SEQ ID NO: 18-34.
 6. A recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 3.
 7. A cell transformed with a recombinant polynucleotide of claim 6.
 8. A transgenic organism comprising a recombinant polynucleotide of claim 6.
 9. A method of producing a polypeptide of claim 1, the method comprising: a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide comprises a promoter sequence operably linked to a polynucleotide encoding the polypeptide of claim 1, and b) recovering the polypeptide so expressed.
 10. A method of claim 9, wherein the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17.
 11. An isolated antibody which specifically binds to a polypeptide of claim 1.
 12. An isolated polynucleotide selected from the group consisting of: a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 18-34, c) a polynucleotide complementary to a polynucleotide of a), d) a polynucleotide complementary to a polynucleotide of b), and e) an RNA equivalent of a)-d).
 13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide of claim 12.
 14. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
 15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides.
 16. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
 17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable excipient.
 18. A composition of claim 17, wherein the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17.
 19. A method for treating a disease or condition associated with decreased expression of functional PRTS, comprising administering to a patient in need of such treatment the composition of claim 17.
 20. A method of screening a compound for effectiveness as an agonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting agonist activity in the sample.
 21. A composition comprising an agonist compound identified by a method of claim 20 and a pharmaceutically acceptable excipient.
 22. A method for treating a disease or condition associated with decreased expression of functional PRTS, comprising administering to a patient in need of such treatment a composition of claim 21.
 23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting antagonist activity in the sample.
 24. A composition comprising an antagonist compound identified by a method of claim 23 and a pharmaceutically acceptable excipient.
 25. A method for treating a disease or condition associated with overexpression of functional PRTS, comprising administering to a patient in need of such treatment a composition of claim 24.
 26. A method of screening for a compound that specifically binds to the polypeptide of claim 1, the method comprising: a) combining the polypeptide of claim 1 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide of claim 1 to the test compound, thereby identifying a compound that specifically binds to the polypeptide of claim 1.
 27. A method of screening for a compound that modulates the activity of the polypeptide of claim 1, the method comprising: a) combining the polypeptide of claim 1 with at least one test compound under conditions permissive for the activity of the polypeptide of claim 1, b) assessing the activity of the polypeptide of claim 1 in the presence of the test compound, and c) comparing the activity of the polypeptide of claim 1 in the presence of the test compound with the activity of the polypeptide of claim 1 in the absence of the test compound, wherein a change in the activity of the polypeptide of claim 1 in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide of claim 1.
 28. A method of screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method comprising: a) exposing a sample comprising the target polynucleotide to a compound, under conditions suitable for the expression of the target polynucleotide, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
 29. A method of assessing toxicity of a test compound, the method comprising: a) treating a biological sample containing nucleic acids with the test compound, b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof, c) quantifying the amount of hybridization complex, and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
 30. A diagnostic test for a condition or disease associated with the expression of PRTS in a biological sample, the method comprising: a) combining the biological sample with an antibody of claim 12, under conditions suitable for the antibody to bind the polypeptide and form an antibody polypeptide complex, and b) detecting the complex, wherein the presence of the complex correlates with the presence of the polypeptide in the biological sample.
 31. The antibody of claim 12, wherein the antibody is: a) a chimeric antibody, b) a single chain antibody, c) a Fab fragment, d) a F(ab′)₂ fragment, or e) a humanized antibody.
 32. A composition comprising an antibody of claim 12 and an acceptable excipient.
 33. A method of diagnosing a condition or disease associated with the expression of PRTS in a subject, comprising administering to said subject an effective amount of the composition of claim 32.
 34. A composition of claim 32, wherein the antibody is labeled.
 35. A method of diagnosing a condition or disease associated with the expression of PRTS in a subject, comprising administering to said subject an effective amount of the composition of claim 34.
 36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 12, the method comprising: a) immunizing an animal with a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibodies from said animal, and c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal antibody which binds specifically to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17.
 37. A polyclonal antibody produced by a method of claim 36.
 38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier.
 39. A method of making a monoclonal antibody with the specificity of the antibody of claim 12, the method comprising: a) immunizing an animal with a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibody producing cells from the animal, c) fusing the antibody producing cells with immortalized cells to form monoclonal antibody-producing hybridoma cells, d) culturing the hybridoma cells, and e) isolating from the culture monoclonal antibody which binds specifically to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17.
 40. A monoclonal antibody produced by a method of claim 39.
 41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier.
 42. The antibody of claim 12, wherein the antibody is produced by screening a Fab expression library.
 43. The antibody of claim 12, wherein the antibody is produced by screening a recombinant immunoglobulin library.
 44. A method of detecting a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17 in a sample, the method comprising: a) incubating the antibody of claim 12 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) detecting specific binding, wherein specific binding indicates the presence of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17 in the sample.
 45. A method of purifying a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17 from a sample, the method comprising: a) incubating the antibody of claim 12 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) separating the antibody from the sample and obtaining the purified polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 1-17.
 46. A microarray wherein at least one element of the microarray is a polynucleotide of claim 13.
 47. A method of generating a transcript image of a sample which contains polynucleotides, the method comprising: a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray of claim 46 with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
 48. An array comprising different nucleotide molecules affixed in distinct physical locations on a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 12.
 49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.
 50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.
 51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to said target polynucleotide.
 52. An array of claim 48, which is a microarray.
 53. An array of claim 48, further comprising said target polynucleotide hybridized to a nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.
 54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to said solid substrate.
 55. An array of claim 48, wherein each distinct physical location on the substrate contains multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical location have the same sequence, and each distinct physical location on the substrate contains nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at another distinct physical location on the substrate.
 56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 1.
 57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 2.
 58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 3.
 59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 4.
 60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 5.
 61. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 6.
 62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 7.
 63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 8.
 64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 9.
 65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 10.
 66. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 11.
 67. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 12.
 68. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 13.
 69. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 14.
 70. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 15.
 71. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 16.
 72. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 17.
 73. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 18.
 74. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 19.
 75. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 20.
 76. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 21.
 77. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 22.
 78. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 23.
 79. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 24.
 80. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 25.
 81. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 26.
 82. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 27.
 83. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 28.
 84. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 29.
 85. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 30.
 86. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 31.
 87. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 32.
 88. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 33.
 89. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO: 34.